ABSTRACT

An adaptive recursive digital filter is presented in which feedback
and feedforward gains are adjusted adaptively to minimize a least
square performance function on a sliding window averaging process.
A two-dimensional version of the adaptive filter is developed and
its performance compared with the optimal Wiener filter. The filter
is shown to be effective in separating three diagonal trajectory
streaks from a background of correlated noise added to white noise.
Although the recursive adaptive filter approaches the optimal Wiener
filter in performance, it does not require a priori statistical
knowledge as does the Wiener filter to which it is compared. The
results indicate that the recursive adaptive filter "learns" the
statistics and adapts.

I. INTRODUCTION

The term "filter" is often applied to any device or system that
processes incoming signals or other data in such a way as to
eliminate noise, to smooth signals, to identify each signal as
belonging to a particular class, or to predict the input signal from
instant to instant. There is an abundance of literature covering the
theories involved under the headings of estimation, identification,
modeling, prediction, etc. The usual method of estimating a signal
corrupted by noise is to pass it through a filter that tends to
suppress the noise while leaving the signal relatively unchanged.
The design of such filters falls in the domain of optimal filtering,
which originated with the pioneering work of Wiener [8] and was
extended and enhanced by the work of Kalman and Bucy [9] and others.

Filters used for the above purpose can be fixed or adaptive. The
design of a fixed filter is based on a priori knowledge of both
signal and noise statistics. On the other hand, adaptive filters have
the ability to adjust their own parameters automatically, and their
design requires little or no a priori knowledge of signal or noise
characteristics. This work presents an approach to signal filtering
using an adaptive filter that is in some sense self-designing.

The adaptive filter described here bases its own "design" (its
adjustment of internal parameters) upon estimated (measured)
statistical characteristics of input and output signals.
The statistics are not measured explicitly and then used to design
the filter; rather, the filter design is accomplished in a single
process by a recursive algorithm that automatically updates the
system parameters with the arrival of each new data sample. It is
assumed that the input and the output at the sampling instants are
the only measurable quantities of the system. It is also assumed that
the unknown filter coefficients (parameters) to be designed enter
linearly in the difference equations that describe the self-designing
process.

The steepest descent method is employed, in which the prevailing
filter parameter vectors are perturbed at each iteration in a manner
so as to decrease a prescribed functional (error criterion or cost
function) to be minimized. The steepest descent method is one of the
well-known gradient-based algorithms.

For the case where the functional being minimized is the mean square
error, where the error is the difference between the filter output
signal and the desired signal, the filter is called the least mean
square filter (LMS filter). Various adaptive algorithms are currently
available, depending upon the cost function and the method used to
minimize the cost function.
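
As an illustration of the steepest descent update, the following
minimal Python sketch adapts a nonrecursive (FIR) filter with the LMS
rule. It is not the recursive algorithm developed in this thesis; the
function name `lms_identify`, the step size `mu`, the number of taps,
and the synthetic "unknown system" `true_w` are assumptions made only
for this example.

```python
import numpy as np

def lms_identify(x, d, num_taps, mu):
    """Adapt FIR weights w so that w . [x(n), x(n-1), ...] tracks d(n)."""
    w = np.zeros(num_taps)
    errors = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        window = x[n - num_taps + 1:n + 1][::-1]   # x(n), x(n-1), ...
        e = d[n] - w @ window                      # instantaneous error
        w = w + 2.0 * mu * e * window              # step against the gradient of e^2
        errors[n] = e
    return w, errors

# Hypothetical test signal: white input and a known 3-tap system.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
true_w = np.array([0.5, -0.3, 0.2])
d = np.convolve(x, true_w)[:len(x)]   # desired signal: x filtered by true_w

w, errors = lms_identify(x, d, num_taps=3, mu=0.01)
```

With a noise-free desired signal and a sufficiently small step size,
the weights `w` settle close to `true_w`; a larger `mu` speeds up
adaptation at the risk of instability.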

The commonly used performance criteria are the least mean square
criterion, the maximum likelihood ratio (MLR) criterion, and the
maximum signal to noise ratio (SNR) criterion. Here only the LMS
criterion is studied and the steepest descent method is employed.
Inevitable errors in the estimation of the statistics prevent the
adaptive filter from delivering optimal performance, since the
adaptive filter is not based on a priori knowledge of the statistics.

In Chapter II, the concept of linear stochastic processes is reviewed
as a preliminary study for this thesis, and the modeling of
stochastic processes is studied. These can be considered as
background material for the following chapters.

In Chapter III, the concept of adaptive filters is introduced and the
structure of the signal and the mathematical model of the processor
are presented. The algorithm for the nonrecursive adaptive filter by
Widrow [1] is reviewed, and the new algorithm for the recursive
adaptive filter is developed, as is the two-dimensional adaptive
filtering process.

In Chapter IV, the adaptive noise cancelling concept is analyzed
qualitatively, and its application to the special case in which no
desired signal is available is analyzed. In Chapter V, an experiment
is performed through computer simulation to check the feasibility of
the algorithms developed in the previous chapter, and a comparison
with the optimal Wiener solution is made. In Chapter VI, the
conclusions are presented together with a summary of the experimental
results and suggestions for further research.
II. LINEAR STOCHASTIC PROCESSES

A. INTRODUCTION

The problem of defining a random process is of considerable
importance in the analysis of systems subject to noise disturbance.
Often a partial definition of the process will suffice, as in the
case of linear least mean square error filtering, where only a
knowledge of the correlation function is required. For other
problems, such as those involving nonlinear filtering, more complete
information will generally be needed. A complete description of a
random process requires a knowledge of the distribution functions of
all orders, but in practice few processes apart from the normal and
Markov are defined in this manner. For the purpose of analysis, a
model to generate the random process is desirable, and for a model to
give a complete description of the process, the distribution
functions should be derivable from the model. While both continuous
and discrete-time linear processes may be defined, only the
discrete-time case will be considered here. The discrete-time linear
process can be considered to be the result of the digital filtering
of a sequence of independent and identically distributed (IID)
random variables.

The  linear  processes  are  important  since  they  are  inherently 
simple  in  terms  of  physical  considerations  and  form  a class 
which  includes  many  discrete  time  normal  random  processes. 

In  the  following  section  the  definition  and  properties  of  the 
linear  processes  are  summarized. 
B. DEFINITION AND PROPERTIES OF LINEAR STOCHASTIC PROCESSES

It has been found useful in the theory of stochastic processes to
divide stochastic processes into two broad classes: stationary and
nonstationary. Intuitively, a stationary process is one whose
distribution remains the same as time progresses, because the random
mechanism producing the process is not changing as time progresses.
A nonstationary process is one which is not stationary.

Let $\{x(i),\ i \in T\}$ be a stochastic process with finite second
moments. Its mean value sequence, denoted by $m(i)$, is defined for
all $i$ in $T$ by

$$ m(i) = E[x(i)] \tag{2-1} $$

and its covariance kernel, denoted by $K(j,i)$, is defined for all
$j$ and $i$ in $T$ by

$$ K(j,i) = \mathrm{Cov}[x(j), x(i)] \tag{2-2} $$

An index set $T$ is said to be a linear index set if it has the
property that the sum $i+j$ of any two numbers $i$ and $j$ of $T$
also belongs to $T$. Examples of such index sets are
$T = \{1, 2, \ldots\}$, $T = \{0, \pm 1, \pm 2, \ldots\}$,
$T = \{i : i \ge 0\}$, and $T = \{i : -\infty < i < \infty\}$.

A stochastic process $\{x(i),\ i \in T\}$, whose index set $T$ is
linear, is said to be

i) strictly stationary of order $k$, where $k$ is a given positive
integer, if for any $k$ points $i, i+1, \ldots, i+k$ in $T^+$, where
$T^+ = \{i : i \ge 0\}$, and any $j$ in $T^+$, the $k$-dimensional
random vectors $\{x(i), x(i+1), \ldots, x(i+k)\}$ and
$\{x(i+j), \ldots, x(i+j+k)\}$ are identically distributed.
ii) strictly stationary if for any integer $k$, it is strictly
stationary of order $k$.

iii) wide-sense stationary (covariance stationary) if it possesses
finite second moments, if its index set $T$ is linear, and if its
covariance kernel $K(j,i)$ is a function only of the absolute
difference $|j-i|$, in the sense that there exists a function $R(n)$
such that for all $j$ and $i$ in $T^+$

$$ K(j,i) = \mathrm{Cov}[x(j), x(i)] = R(j-i) \tag{2-3} $$

or more precisely, $R_{xx}(m)$ has the property for every $i$ and
$m$ in $T^+$

$$ \mathrm{Cov}[x(i), x(i+m)] = E[x(i)\,x(i+m)] = R_{xx}(m) \tag{2-4} $$

We call $R_{xx}(m)$ the covariance function (autocorrelation
function) of the wide-sense stationary time series
$\{x(i),\ i \in T^+\}$.

The second problem concerns the concepts of ergodicity and the
strong law of large numbers in terms of linear processes. To present
a complete discussion of this question is not reasonable for review
purposes, but it is interesting to consider certain aspects of it.
For strict-sense stationary processes, the ergodic theorem is the
strong law of large numbers and states that if
$\{x(i),\ i \in T^+\}$ is a strict-sense stationary, ergodic random
process and $E[|x(0)|] < \infty$, then

$$ \lim_{m \to \infty} \frac{1}{m} \sum_{i=1}^{m} x(i) = E[x(0)] \quad \text{with probability 1.} \tag{2-5} $$
In general, a stochastic process is said to be ergodic if it has the
property that sample (or time) averages formed from an observed
sequence of the process may be used as an approximation to the
corresponding ensemble average.

Stationarity and ergodicity concepts are readily extended to
two-dimensional random fields (for a two-dimensional signal, the
term random field is preferred to random process).
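
The ergodic property for a one-dimensional sequence can be
illustrated with a small numerical sketch. The process below (a
constant mean plus IID Gaussian noise), the sample sizes, and the
variable names are assumptions chosen only to keep the example short;
any stationary ergodic process could be substituted.

```python
import numpy as np

rng = np.random.default_rng(1)
mean = 2.0
num_realizations, num_samples = 200, 20000

# Each row is one realization of x(i) = mean + u(i), with u(i) IID, zero mean.
x = mean + rng.standard_normal((num_realizations, num_samples))

time_average = x[0].mean()          # average along one observed sequence
ensemble_average = x[:, 0].mean()   # average across realizations at one index
```

Both averages approximate the same value, which is the sense in which
a single observed sequence can stand in for the ensemble.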

The assumption that the field is stationary means that the
statistics of a point in the field are not dependent on the location
of the point. Then, a stationary, two-dimensional field has an
autocorrelation function defined as

$$ R_{xx}(m,n) = E\{x(k,\ell)\,x(k+m,\ell+n)\} \tag{2-6} $$

and it is also said to be ergodic if the statistical (ensemble)
average of the random field $x(k,\ell)$ is equal to the spatial
average over all points. That means

$$ E[x(k,\ell)] = \langle x \rangle \tag{2-7} $$

where $\langle x \rangle$ by definition represents spatial averaging,

$$ \langle x \rangle = \lim_{M_1, M_2 \to \infty} \frac{1}{M_1 M_2} \sum_{k=1}^{M_1} \sum_{\ell=1}^{M_2} x(k,\ell) \tag{2-8} $$

Now, consider a stationary sequence of random variables
$\{x(i),\ i \in T\}$. The correlation function of the sequence may
be written in the form

$$ R_{xx}(n) = \int_{-\pi}^{\pi} e^{iwn}\, dF(w) \tag{2-9} $$
where $F(w)$ is a nondecreasing function, called the spectral
distribution. If $F(w)$ is absolutely continuous with
$F'(w) = f(w)$ almost everywhere, we may write

$$ R_{xx}(n) = \int_{-\pi}^{\pi} e^{iwn} f(w)\, dw \tag{2-10} $$

Under certain conditions (e.g., $\sum_{n=0}^{\infty} |R_{xx}(n)|$
finite), the correlation function may be inverted to yield the
spectral density as

$$ f(w) = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} R_{xx}(n)\, e^{-iwn} \tag{2-11} $$
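
The pair (2-10) and (2-11) can be checked numerically. The sketch
below assumes a hypothetical correlation function
$R_{xx}(n) = \rho^{|n|}$ with $\rho = 0.6$, truncates the sum in
(2-11) at 200 lags, and replaces the integral in (2-10) by a Riemann
sum on a uniform frequency grid; these choices are illustrative
assumptions, not part of the derivation.

```python
import numpy as np

rho = 0.6
lags = np.arange(-200, 201)
R = rho ** np.abs(lags)                  # assumed correlation function

w = np.linspace(-np.pi, np.pi, 1024, endpoint=False)

# Equation (2-11): f(w) = (1/2pi) * sum_n R(n) exp(-i w n)
f = (np.exp(-1j * np.outer(w, lags)) @ R).real / (2 * np.pi)

# Equation (2-10): R(n) = integral over [-pi, pi] of exp(i w n) f(w) dw
dw = w[1] - w[0]

def correlation_from_density(n):
    return float((np.sum(np.exp(1j * w * n) * f) * dw).real)

# Closed form of the same sum for R(n) = rho^|n| (geometric series).
closed_form = (1 - rho**2) / (2 * np.pi * (1 - 2 * rho * np.cos(w) + rho**2))
```

Evaluating `correlation_from_density(n)` for small `n` returns values
close to `rho**abs(n)`, so the density computed from (2-11) inverts
back to the correlation function through (2-10).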

A sequence of random variables $\{x(i),\ i \in T\}$ is said to be a
process of moving average if it admits the mean square
representation

$$ x(i) = \sum_{j=-\infty}^{\infty} h(i-j)\, u(j) \tag{2-12} $$

where $\{u(j),\ j \in T\}$ is a collection of orthonormal random
variables (a sequence of white noise). If the sequence
$\{h(i),\ i \in T\}$ is one sided (i.e., $h(i) = 0$, $i < 0$), then
$\{x(i),\ i \in T\}$ is called a one-sided moving average process.

A process is said to be regular if the error for prediction one time
unit ahead is nonzero. It is known that a process is one of moving
averages if and only if its spectral distribution function is
absolutely continuous. Furthermore, a process with an absolutely
continuous spectral distribution function is regular if its moving
average representation is one sided. These facts serve to motivate
the following definition of a linear process.
DEFINITION [5]

A linear process $\{x(i),\ i \in T\}$ is one having the structure

$$ x(j) = \sum_{i=0}^{\infty} h(i)\, u(j-i) = \sum_{i=-\infty}^{j} h(j-i)\, u(i), \tag{2-13} $$

where $\{u(i),\ i \in T\}$ is a sequence of independent and
identically distributed (IID) random variables. The set of real
constants $\{h(i),\ i \in T^+\}$ is such that
$\sum_{i=0}^{\infty} |h(i)| < \infty$, and the function
$H(z) = \sum_{i=0}^{\infty} h(i)\, z^{-i}$, where $z$ is a complex
variable, is analytic and has no poles outside the unit circle in
the $z$ plane. The correlation function of this process is given by

$$ R_{xx}(n) = \sum_{j=0}^{\infty} h(j)\, h(j+n) \tag{2-14} $$

and the corresponding spectral density $f(w)$ is

$$ f(w) = \frac{1}{2\pi} \left| \sum_{j=0}^{\infty} h(j)\, e^{-iwj} \right|^2 = \frac{1}{2\pi} \left| H(e^{iw}) \right|^2 \tag{2-15} $$
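
Equations (2-13) and (2-14) can be checked with a short simulation.
The sketch below assumes a hypothetical three-term impulse response
`h` and a Gaussian IID input of zero mean and unit variance; the
helper names are illustrative only.

```python
import numpy as np

h = np.array([1.0, 0.5, 0.25])        # hypothetical one-sided impulse response

def theoretical_correlation(h, n):
    """R_xx(n) = sum_j h(j) h(j+n), equation (2-14), for n >= 0."""
    if n >= len(h):
        return 0.0
    return float(np.sum(h[:len(h) - n] * h[n:]))

rng = np.random.default_rng(2)
u = rng.standard_normal(200_000)      # IID, zero mean, unit variance
x = np.convolve(u, h)[:len(u)]        # x(j) = sum_i h(i) u(j-i), equation (2-13)

def sample_correlation(x, n):
    """Time-average estimate of E[x(i) x(i+n)] for n >= 0."""
    return float(np.mean(x[:len(x) - n] * x[n:]))
```

For lags 0 to 3 the sample estimates fall close to the values given
by (2-14), with the correlation vanishing beyond the length of `h`.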

It is assumed further that the process (2-13) has a rational
spectral density, that is

$$ f(w) = \frac{1}{2\pi} \left| H(e^{iw}) \right|^2 = \frac{1}{2\pi} \left| \frac{B(e^{iw})}{A(e^{iw})} \right|^2 \tag{2-16} $$

where both $A(e^{iw})$ and $B(e^{iw})$ are polynomials in $e^{iw}$
of finite order with all their zeros inside the unit circle.

The process (2-13) may be generated by passing the sequence
$\{u(i),\ i \in T\}$ through the digital filter $H(z)$. By the
assumption in the definition of the linear process and by the
restrictions on the spectrum, there exists an inverse $D(z)$, where

$$ D(z) = \frac{1}{H(z)}. $$

In general, $D(z)$ and $H(z)$ may be infinite polynomials in
$z^{-1}$. $D(z)$ may be written as the one-sided sequence

$$ D(z) = \sum_{i=0}^{\infty} d_i\, z^{-i} \tag{2-17} $$

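Passing the $x(i)$ sequence through the digital filter $D(z)$ will
recover the generating sequence $u(i)$, that is

$$ u(j) = \sum_{i=-\infty}^{j} d(j-i)\, x(i) = \sum_{i=0}^{\infty} d(i)\, x(j-i) \tag{2-18} $$

This is called an inversion operation. In general, we will have

$$ H(z) = \frac{\sum_{i=0}^{m} b_i\, z^{-i}}{1 + \sum_{i=1}^{n} a_i\, z^{-i}} \tag{2-19} $$

and the process $\{x(i),\ i \in T\}$ may be represented in two ways:

$$ \text{i)}\quad x(j) = \sum_{i=0}^{\infty} h(i)\, u(j-i) $$

$$ \text{ii)}\quad x(j) = \sum_{i=0}^{m} b_i\, u(j-i) - \sum_{i=1}^{n} a_i\, x(j-i) \tag{2-20} $$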
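
The equivalence of the two representations can be sketched
numerically: the recursion of form ii) is driven once with a unit
pulse to obtain $h(i)$, and the convolution of form i) is then
compared with the recursion for a random input. The coefficient
values and the function name `arma_filter` are hypothetical, and zero
initial conditions are assumed.

```python
import numpy as np

def arma_filter(b, a, u):
    """x(j) = sum_{i>=0} b[i] u(j-i) - sum_{i>=1} a[i] x(j-i), with a[0] = 1."""
    x = np.zeros(len(u))
    for j in range(len(u)):
        acc = sum(b[i] * u[j - i] for i in range(len(b)) if j - i >= 0)
        acc -= sum(a[i] * x[j - i] for i in range(1, len(a)) if j - i >= 0)
        x[j] = acc
    return x

b = [1.0, 0.4]            # hypothetical feedforward coefficients b_i
a = [1.0, -0.5, 0.06]     # hypothetical feedback coefficients, a[0] = 1

# Impulse response h(i): drive the recursion with a unit pulse.
length = 200
pulse = np.zeros(length)
pulse[0] = 1.0
h = arma_filter(b, a, pulse)

rng = np.random.default_rng(3)
u = rng.standard_normal(length)
x_recursive = arma_filter(b, a, u)             # representation ii)
x_convolution = np.convolve(u, h)[:length]     # representation i)
```

The two outputs agree to rounding error, since the recursion is a
linear time-invariant system whose response to a unit pulse is
exactly the sequence $h(i)$.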


The second representation indicates that if $H(z)$ is an all-pole
function, then, with $a_0 = 1$, we have

$$ \text{iii)}\quad \sum_{i=0}^{n} a_i\, x(j-i) = u(j). \tag{2-21} $$

In this case, since inversion uses only a finite number of past
samples, the process is called "finitely invertible." It is clear
that finitely invertible linear processes form a subclass of
autoregressive schemes, for which case the set $\{u(i)\}$ in (2-21)
would be orthonormal rather than independent. The concepts and
definitions above can be readily extended to a two-dimensional
linear process. A two-dimensional linear process
$\{x(m,n),\ m \in Z_1,\ n \in Z_2\}$ could have the structure
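
$$ x(m,n) = \sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} h(k,\ell)\, u(m-k,\, n-\ell) = \sum_{k=-\infty}^{m} \sum_{\ell=-\infty}^{n} h(m-k,\, n-\ell)\, u(k,\ell), \tag{2-22} $$

where $\{u(k,\ell),\ k \in Z_1,\ \ell \in Z_2\}$ is a
two-dimensional sequence of IID random variables with zero mean and
unit variance. The set of real constants
$\{h(k,\ell),\ k \in Z_1^+,\ \ell \in Z_2^+\}$ is such that
$\sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} |h(k,\ell)| < \infty$,
and the function
$H(z_1, z_2) = \sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} h(k,\ell)\, z_1^{-k} z_2^{-\ell}$,
where $z_1$ and $z_2$ are complex variables, is analytic. The
correlation function of this process is given by

$$ R_{xx}(m,n) = \sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} h(k,\ell)\, h(k+m,\, \ell+n) \tag{2-23} $$

and the corresponding spectral density $f(w_1, w_2)$ is

$$ f(w_1, w_2) = \frac{1}{(2\pi)^2} \left| \sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} h(k,\ell)\, e^{-i w_1 k} e^{-i w_2 \ell} \right|^2 = \frac{1}{(2\pi)^2} \left| H(e^{i w_1}, e^{i w_2}) \right|^2 \tag{2-24} $$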


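With the same reasoning as in the one-dimensional case, we have in
general,

$$ H(z_1, z_2) = \frac{\displaystyle \sum_{i=0}^{N_1} \sum_{j=0}^{N_2} b_{ij}\, z_1^{-i} z_2^{-j}}{\displaystyle 1 + \sum_{i=0}^{M_1} \sum_{\substack{j=0 \\ (i,j) \neq (0,0)}}^{M_2} a_{ij}\, z_1^{-i} z_2^{-j}} = \sum_{k=0}^{\infty} \sum_{\ell=0}^{\infty} h(k,\ell)\, z_1^{-k} z_2^{-\ell} \tag{2-25} $$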
and the process $\{x(k,\ell),\ k \in Z_1,\ \ell \in Z_2\}$ may be
represented in two ways:

$$ \text{i)}\quad x(k,\ell) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} h(i,j)\, u(k-i,\, \ell-j) $$

$$ \text{ii)}\quad x(k,\ell) = \sum_{i=0}^{N_1} \sum_{j=0}^{N_2} b_{ij}\, u(k-i,\, \ell-j) - \sum_{i=0}^{M_1} \sum_{\substack{j=0 \\ (i,j) \neq (0,0)}}^{M_2} a_{ij}\, x(k-i,\, \ell-j) \tag{2-26} $$

It should be noted that the moving average scheme would be
accomplished by passing the orthonormal IID random variables through
the nonrecursive filter, and the autoregressive schemes through the
recursive filter, for both one-dimensional and two-dimensional
random processes.

In the study of systems subject to random signals, the concept of
the power spectral density function is of importance. For a given
transfer function of a linear filter, the cross power spectrum
between the input and output of the filter and the output power
spectrum are of primary concern.

Consider a (continuous) system subjected to a random input signal.
Given a linear system with a transfer function $H(jw)$, whose input
is a stationary process $x(t)$ with an autocorrelation function
$R_{xx}(\tau)$ and power spectral density function $G_{xx}(w)$, the
following relationships are obtained:

$$ G_{xy}(w) = G_{xx}(w)\, H^*(jw) $$

$$ G_{yy}(w) = G_{xy}(w)\, H(jw) \tag{2-27} $$

Combining the above two equations,

$$ G_{yy}(w) = G_{xx}(w)\, |H(jw)|^2, \tag{2-28} $$
where $G_{xy}(w)$ is the cross power spectrum and $G_{yy}(w)$ is the
output power spectrum. The cross correlation would be calculated by
using the inverse Fourier transform of $G_{xy}(w)$.

For the continuous two-dimensional linear system
$H(jw_1, jw_2)$, subjected to a stationary two-dimensional input
signal of power spectrum $G_{xx}(w_1, w_2)$,

$$ G_{xy}(w_1, w_2) = G_{xx}(w_1, w_2)\, H^*(jw_1, jw_2) $$

$$ G_{yy}(w_1, w_2) = G_{xy}(w_1, w_2)\, H(jw_1, jw_2) \tag{2-29} $$

Again combining the above two relationships, the output power
spectral density function is

$$ G_{yy}(w_1, w_2) = G_{xx}(w_1, w_2)\, |H(jw_1, jw_2)|^2 \tag{2-30} $$


Consider a discrete linear system with a transfer function $H(z)$ to
which a random input sequence is applied. The input sequence is a
stationary process $x(i)$ with an autocorrelation function
$R_{xx}(m)$ and its z-transform $G_{xx}(z)$, where $G_{xx}(z)$ is
equivalent to the power spectral density in the continuous case,
i.e.,

$$ G_{xx}(z) = \mathcal{Z}[R_{xx}(m)]. \tag{2-31} $$

Then,

$$ G_{xy}(z) = G_{xx}(z)\, H(z) $$

$$ G_{yy}(z) = G_{xx}(z)\, H(z)\, H(z^{-1}) \tag{2-32} $$

For the two-dimensional discrete case,

$$ G_{xy}(z_1, z_2) = G_{xx}(z_1, z_2)\, H(z_1, z_2) $$

and

$$ G_{yy}(z_1, z_2) = G_{xx}(z_1, z_2)\, H(z_1, z_2)\, H(z_1^{-1}, z_2^{-1}) \tag{2-33} $$
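
The one-dimensional relation (2-32) can be verified on the unit
circle for a simple case. The sketch below assumes a white input
(so that $G_{xx}(z) = 1$), a hypothetical three-term FIR transfer
function `h`, and uses the output autocorrelation of form (2-14);
the helper `on_unit_circle` is an illustrative name.

```python
import numpy as np

h = np.array([1.0, -0.6, 0.2])     # hypothetical FIR transfer function coefficients
w = np.linspace(-np.pi, np.pi, 512, endpoint=False)

def on_unit_circle(seq, lags, w):
    """Evaluate the z-transform sum_n seq[n] z^{-n} at z = exp(i w)."""
    return np.exp(-1j * np.outer(w, lags)) @ seq

H = on_unit_circle(h, np.arange(len(h)), w)        # H(z)
H_inv = on_unit_circle(h, -np.arange(len(h)), w)   # H(z^{-1}) at the same points

# White input: R_xx(m) = delta(m), so G_xx(z) = 1.
G_xx = np.ones_like(w)

# Output autocorrelation R_yy(n) = sum_j h(j) h(j+n) for n = -(L-1), ..., L-1.
R_yy = np.correlate(h, h, mode="full")
lags = np.arange(-(len(h) - 1), len(h))
G_yy = on_unit_circle(R_yy, lags, w)
```

On the unit circle $H(z^{-1})$ is the complex conjugate of $H(z)$ for
real coefficients, so `G_yy` is real and equals `np.abs(H) ** 2`,
which is the discrete counterpart of (2-28).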




As a summary of this section, a discrete-time linear process can be
considered to be the result of digitally filtering an independent,
identically distributed random sequence having zero mean and unit
variance. The moving average scheme is the result of filtering
through the nonrecursive filter, and the autoregressive scheme is
that of filtering through the recursive filter.

For a linear system, the relations between transfer function, power
spectrum, and autocorrelation are given by

$$ G_{yy} = G_{xx}\, |H|^2 $$

$$ G_{xy} = G_{xx}\, H $$

$R$ (autocorrelation function) = inverse transform of the power
spectral density function.

Figure 2-1 shows the block diagram which describes the various
relations and concepts.
FIGURE 2-1. STATISTICAL PROPERTIES OF LINEAR SYSTEM


C. MODELING OF LINEAR STOCHASTIC PROCESS

1. Introduction

A more active concern at this time is that of system modeling. It
has been shown in a previous section that a linear stochastic
process (or field) can be generated by filtering white noise through
a linear filter. The problem can be stated as follows: what is the
filter equation (difference equation) that produces a typical random
process with a specified autocorrelation function? That is, with the
knowledge of second-order statistics, determine the filter
coefficients a's and b's in equations (2-20) and (2-26). It is clear
that if one is successful in developing a parametric model for the
behavior of some random process, then the model can be used for
different applications, such as prediction, estimation, smoothing,
etc. As far as the general modeling problem goes, one of the most
powerful models currently in use is that where a signal $x(n)$ is
considered to be the output of some system (filter) with unknown
input $u(n)$ such that the following relationship holds:

$$ x(n) = -\sum_{k=1}^{p} a_k\, x(n-k) + G \sum_{\ell=0}^{q} b_\ell\, u(n-\ell), \quad b_0 = 1 \tag{2-34} $$

where $a_k$, $1 \le k \le p$, $b_\ell$, $1 \le \ell \le q$, and the
gain $G$ are the parameters of the hypothesized system. Equation
(2-34) says that the "output" $x(n)$ is a linear function of past
outputs and present and past inputs. That is, the signal $x(n)$ is
predictable from linear combinations of past
outputs and inputs. For the two-dimensional case, the difference
equation corresponding to equation (2-34) would be

$$ x(k,\ell) = -\sum_{i=0}^{M_1} \sum_{j=0}^{M_2} a_{ij}\, x(k-i,\, \ell-j) + G \sum_{\ell_1=0}^{L_1} \sum_{\ell_2=0}^{L_2} b_{\ell_1 \ell_2}\, u(k-\ell_1,\, \ell-\ell_2) \tag{2-35} $$

where $(i,j) \neq (0,0)$.

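Equations (2-34) and (2-35) can also be specified in the frequency
domain by taking the z transform of both sides of Eq. (2-34) and
Eq. (2-35):

$$ H(z) = \frac{X(z)}{U(z)} = G\, \frac{1 + \sum_{\ell=1}^{q} b_\ell\, z^{-\ell}}{1 + \sum_{k=1}^{p} a_k\, z^{-k}} \tag{2-36} $$

where $X(z) = \sum_{n=-\infty}^{\infty} x(n)\, z^{-n}$ is the z
transform of $x(n)$, and $U(z)$ is the z transform of $u(n)$.

For the two-dimensional case,

$$ H(z_1, z_2) = \frac{X(z_1, z_2)}{U(z_1, z_2)} = G\, \frac{\displaystyle \sum_{\ell_1=0}^{L_1} \sum_{\ell_2=0}^{L_2} b_{\ell_1 \ell_2}\, z_1^{-\ell_1} z_2^{-\ell_2}}{\displaystyle 1 + \sum_{i=0}^{M_1} \sum_{\substack{j=0 \\ (i,j) \neq (0,0)}}^{M_2} a_{ij}\, z_1^{-i} z_2^{-j}} \tag{2-37} $$

where $X(z_1, z_2) = \mathcal{Z}[x(k,\ell)]$ and
$U(z_1, z_2) = \mathcal{Z}[u(k,\ell)]$.

$H(z)$ and $H(z_1, z_2)$ in Eqs. (2-36) and (2-37) are the general
pole-zero models.
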


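The roots of the numerator and denominator polynomials are the zeros
and poles of the model, respectively. There are two special cases of
the model that are of interest:

i) all-zero model: $a_k = 0$, $1 \le k \le p$; $a_{ij} = 0$

ii) all-pole model: $b_\ell = 0$, $1 \le \ell \le q$;
$b_{\ell_1 \ell_2} = 0$, $0 \le \ell_1 \le L_1$,
$0 \le \ell_2 \le L_2$, but $(\ell_1, \ell_2) \neq (0,0)$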
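
For the one-dimensional model (2-36), the zeros and poles can be
found numerically from the coefficient lists. The sketch below uses
hypothetical coefficient values and `numpy.roots`; any extra poles or
zeros at the origin that arise when the numerator and denominator
orders differ are ignored here.

```python
import numpy as np

b = [1.0, 0.3]           # hypothetical numerator: 1 + b_1 z^{-1}
a = [1.0, -0.9, 0.2]     # hypothetical denominator: 1 + a_1 z^{-1} + a_2 z^{-2}

# Multiplying each polynomial in z^{-1} by a power of z turns it into an
# ordinary polynomial in z with the same coefficient list.
zeros = np.roots(b)
poles = np.roots(a)

# The recursive (pole) part is stable when every pole lies inside the unit circle.
is_stable = bool(np.all(np.abs(poles) < 1.0))
```

Setting `b = [1.0]` gives the all-pole case, and setting `a = [1.0]`
gives the all-zero case.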

As mentioned in section II-B, the all-zero model is known in the
statistical literature as the moving average (MA) model, and the
all-pole model is known as the autoregressive (AR) model. The
pole-zero model is then known as the autoregressive moving average
(ARMA) model. It should be recalled here that we are interested in
the case where $u(n)$ or $u(k,\ell)$ is white noise, and this will
be treated as a special case in the following. The modeling problem
can be stated as "given a signal $x(n)$ or $x(k,\ell)$, find the
filter coefficients (a's and b's) and the gain $G$ in Equation
(2-34) in some manner." Two approaches will be given for a solution
of the above problem. The first is the method of least squares,
which is based on the optimal estimation concept, and the second is
the filter response method, in which linear system properties are
used. The one-dimensional case will be treated first, then the
two-dimensional case, including some examples. As examples, a
lowpass correlated random process (field) and a band-limited random
process (field) are chosen, since they represent practical cases.

2. The Method of Least Squares

Although the following can be applied to deterministic signals and
to stochastic processes, stationary or nonstationary, it is
developed here only for a stationary random process, and only the
all-pole model is considered [6]. In the all-pole model, we assume
that the signal $x(n)$ is given as a linear combination of past
values and some input $u(n)$:

$$ x(n) = -\sum_{k=1}^{p} A_k\, x(n-k) + G\, u(n) \tag{2-38} $$

where $G$ is a gain factor. Here, it is assumed that the input
$u(n)$ is totally unknown, which is the case in many applications.
Therefore, the signal $x(n)$ can be predicted only approximately
from a linearly weighted summation of past samples. Let this
approximation of $x(n)$ be $\hat{x}(n)$, where

$$ \hat{x}(n) = -\sum_{k=1}^{p} A_k\, x(n-k), \tag{2-39} $$

then the error between the actual value $x(n)$ and the predicted
value $\hat{x}(n)$ is given by

$$ e(n) = x(n) - \hat{x}(n) = x(n) + \sum_{k=1}^{p} A_k\, x(n-k) \tag{2-40} $$

$e(n)$ is also known as the residual. In the method of least
squares, the parameters $A_k$ are obtained as a result of the
minimization of the mean square error with respect to all of the
parameters.



If the signal $x(n)$ is assumed to be a sample of a random process,
then the error in equation (2-40) is also a sample of a random
process. In the least squares method, we minimize the expected value
of the square of the error,

$$ E[e^2(n)] = E\left\{ \left[ x(n) + \sum_{k=1}^{p} A_k\, x(n-k) \right]^2 \right\} \tag{2-41} $$

$E[e^2(n)]$ is minimized by setting

$$ \frac{\partial E[e^2(n)]}{\partial A_i} = 0, \quad 1 \le i \le p \tag{2-42} $$

From (2-41) and (2-42) we obtain the set of equations

$$ \sum_{k=1}^{p} A_k\, E[x(n-k)\, x(n-i)] = -E[x(n)\, x(n-i)], \quad 1 \le i \le p \tag{2-43} $$

Then the minimum average error is given by

$$ E[e^2(n)]_{\min} = E[x^2(n)] + \sum_{k=1}^{p} A_k\, E[x(n)\, x(n-k)] \tag{2-44} $$

For a stationary process $x(n)$, we have

$$ E[x(n-k)\, x(n-i)] = R_{xx}(i-k) $$

where $R_{xx}$ is the autocorrelation of the process.

Note that equations (2-42) and (2-44) lead to the well-known
orthogonality principle [7]. Since in the least squares method we
assumed that the input is unknown, it does not make much sense to
determine a value for the gain $G$. However, there are certain
interesting points that can be made.
Equation (2-39) can be written as

$$ x(n) = -\sum_{k=1}^{p} A_k\, x(n-k) + e(n) \tag{2-45} $$

Comparing (2-45) and (2-38), it is seen that the only input signal
$u(n)$ that will result in the signal $x(n)$ as output is that where

$$ G\, u(n) = e(n). \tag{2-46} $$

That is, the input signal is proportional to the error signal;
$e(n)$ can also be considered as the modeling error. The error
variance can be calculated by Equation (2-41), and the filter
coefficients $A_k$ ($k = 1, \ldots, p$) can be calculated by
equation (2-43) if the correlation function of the process $x(n)$
is available.
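
When the correlation function is not known in advance, it can be
estimated from data and inserted into (2-43). The following sketch
makes several assumptions for illustration: a synthetic second-order
all-pole process with hypothetical coefficients `A_true` and gain
`G_true`, a time-average estimate of the autocorrelation, and a
direct linear solve of the normal equations.

```python
import numpy as np

rng = np.random.default_rng(4)
A_true, G_true = np.array([-0.9, 0.2]), 0.5   # hypothetical model parameters
N = 100_000
u = rng.standard_normal(N)

# Generate x(n) = -A_1 x(n-1) - A_2 x(n-2) + G u(n), equation (2-38) with p = 2.
x = np.zeros(N)
for n in range(N):
    x[n] = G_true * u[n]
    if n >= 1:
        x[n] -= A_true[0] * x[n - 1]
    if n >= 2:
        x[n] -= A_true[1] * x[n - 2]

def sample_autocorrelation(x, max_lag):
    """Time-average estimates of R_xx(0), ..., R_xx(max_lag)."""
    return np.array([np.mean(x[:len(x) - m] * x[m:]) for m in range(max_lag + 1)])

p = 2
r = sample_autocorrelation(x, p)

# Normal equations (2-43), with E[x(n-k) x(n-i)] replaced by r(|i-k|).
R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])
A_hat = np.linalg.solve(R, -r[1:p + 1])

# Residual e(n) = x(n) + sum_k A_k x(n-k), equation (2-40).
e = x[p:] + A_hat[0] * x[p - 1:-1] + A_hat[1] * x[:-2]
```

The estimated coefficients approach `A_true`, and the variance of the
residual approaches `G_true ** 2`, in line with (2-46).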

At this moment, it should be recalled that a linear random process
is generated by linear filtering of white noise. Therefore, we are
interested in white noise inputs for the purpose of modeling a given
linear random process. That is, the input $u(n)$ is assumed to be a
sequence of uncorrelated samples (white noise) with zero mean and
unit variance,

$$ E[u(n)] = 0, \quad E[u(n)\, u(m)] = \delta_{nm} $$

Then the output $x(n)$ forms a stationary random process

$$ x(n) = -\sum_{k=1}^{p} A_k\, x(n-k) + G\, u(n) \tag{2-47} $$

Multiplying equation (2-47) by $x(n-i)$ and taking the expectation,

$$ E[x(n)\, x(n-i)] = -\sum_{k=1}^{p} A_k\, E[x(n-k)\, x(n-i)] + E[G\, u(n)\, x(n-i)] \tag{2-48} $$

Noting that $u(n)$ and $x(n-i)$ are uncorrelated for $i > 0$ and
recalling that for a stationary process
$E[x(n)\, x(n-i)] = R_{xx}(i)$, Equation (2-48) turns out to be

$$ R_{xx}(i) = -\sum_{k=1}^{p} A_k\, R_{xx}(i-k) \quad \text{for } p \ge i > 0 \tag{2-49} $$

and $R_{xx}(0)$ can be obtained by substituting $x(n)$ of Equation
(2-38) into Equation (2-48):

$$ R_{xx}(0) = -\sum_{k=1}^{p} A_k\, R_{xx}(k) + G^2 \tag{2-50} $$

Therefore, the gain can be given by

$$ G^2 = R_{xx}(0) + \sum_{k=1}^{p} A_k\, R_{xx}(k). \tag{2-51} $$
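
Equations (2-49) and (2-51) amount to one linear solve followed by
one sum. The sketch below assumes a hypothetical lowpass correlation
function $R_{xx}(m) = \rho^{|m|}$ and a model order $p = 2$; the
function name `all_pole_model` is illustrative.

```python
import numpy as np

def all_pole_model(R, p):
    """Solve (2-49) for A_1..A_p and (2-51) for G^2, given a callable R(m)."""
    matrix = np.array([[R(i - k) for k in range(1, p + 1)]
                       for i in range(1, p + 1)])
    rhs = -np.array([R(i) for i in range(1, p + 1)])
    A = np.linalg.solve(matrix, rhs)
    G2 = R(0) + sum(A[k - 1] * R(k) for k in range(1, p + 1))
    return A, G2

rho = 0.8

def R(m):
    """Assumed first-order (lowpass) correlation function."""
    return rho ** abs(m)

A, G2 = all_pole_model(R, p=2)
```

For this correlation function the second coefficient comes out as
zero and $G^2 = 1 - \rho^2$, which indicates that a first-order model
already describes the process.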


It is noted that through Equation (2-46), that is $G\,u(n) = e(n)$,
the white noise input of zero mean and unit variance generates the
random process $e(n)$, which is again white with zero mean and
variance $G^2$. Therefore, from Equation (2-49), the recursive
filter coefficients $A_k$ ($k = 1, \ldots, p$) can be calculated,
and using these values the gain $G$ can be calculated by Equation
(2-51) with the knowledge of the autocorrelation function of a given
class of linear random process. So far, modeling of a
one-dimensional stochastic process has been considered. Similar
reasoning can readily be extended to the modeling of two-dimensional
random fields. Again, the two-dimensional all-pole model is
considered:

$$ x(k,\ell) = -\sum_{i=0}^{M_1} \sum_{\substack{j=0 \\ (i,j) \neq (0,0)}}^{M_2} A_{ij}\, x(k-i,\, \ell-j) + G\, u(k,\ell) \tag{2-52} $$
Let us define the set $\Omega(k,\ell)$ such that for all $i, j$,
$(k-i,\, \ell-j) \in \Omega(k,\ell)$; all the values of $x$ in
$\Omega(k,\ell)$ will be used to estimate the point $x(k,\ell)$.

FIGURE 2-2. DEFINITION OF $\Omega(k,\ell)$
Again the coefficients $A_{ij}$ will be determined such that the
mean squared error is minimized. The estimate of $x(k,\ell)$,
$\hat{x}(k,\ell)$, is given by a linear combination of the previous
values,

$$ \hat{x}(k,\ell) = -\sum_{i} \sum_{j} A_{ij}\, x(k-i,\, \ell-j) \tag{2-53} $$

The mean squared error is

$$ E[e^2(k,\ell)] = E\{[x(k,\ell) - \hat{x}(k,\ell)]^2\} \tag{2-54} $$

If $E[e^2(k,\ell)]$ is minimized, $\hat{x}(k,\ell)$ is the "linear
least squares estimate" of $x(k,\ell)$.

Going through the same procedure as in the one-dimensional case,
that is, substituting (2-53) in (2-54), differentiating with respect
to each $A_{ij}$, and setting the derivatives equal to zero, we
obtain the following set of simultaneous equations for the unknown
$A_{ij}$:

$$ E\{[x(k,\ell) - \hat{x}(k,\ell)]\, x(i,j)\} = 0 \quad \text{for all } (i,j) \in \Omega, \tag{2-55} $$

which says that the coefficients $A_{ij}$ must be such that the
estimation error $[x(k,\ell) - \hat{x}(k,\ell)]$ is statistically
orthogonal to each $x(i,j)$ that is used to form the linear
estimate. This is known as the orthogonality principle in linear
least squares estimation.

The modeling error is the difference between the true value
$x(k,\ell)$ and the estimate $\hat{x}(k,\ell)$. By definition,

$$ e(k,\ell) = x(k,\ell) - \hat{x}(k,\ell) = G\, u(k,\ell) \tag{2-56} $$

from equations (2-52) and (2-53).

Again, we are interested in the white noise field of zero mean and
unit variance. Then the modeling error is also a random field.


Rewriting  the  Equation  (2~52)  in  terms  of  the  error  e(k,£)  gives 
x(k,£)=-ZjjE  A^jX(k-i,£-j)+e(k,£)  (2-57) 

To  calculate  the  error  variance,  multiply  (2-57)  by  x(k-m,£-n) 
and  take  the  expectations 

E[x(k,Jl)x(k-m,£-n)  ]=-  Z^ZAj^jE  [x(k-i,£-j)x(k-m,£-n)  ] 

+E [e(k,£)x(k-m,£-n) ] (2-58) 

For the stationary process, $E[x(k-i,\ell-j)\,x(k-m,\ell-n)] = R_{xx}(m-i,n-j)$, and (2-58) becomes

$$R_{xx}(m,n) = -\sum_{i}\sum_{j} A_{ij}\,R_{xx}(m-i,n-j) \quad \text{for all } (m,n)\in\Omega$$  (2-59)

The second term on the right side of (2-58) is zero for $(m,n)\in\Omega$ because of the orthogonality principle, and $R_{xx}(0,0)$ can be obtained from the following equation:

$$R_{xx}(0,0) = -\sum_{i}\sum_{j} A_{ij}\,R_{xx}(i,j) + G^2$$  (2-60)

Therefore the modeling error variance, $E[e^2(k,\ell)] = E\{[G\,u(k,\ell)]^2\} = G^2$, is given by

$$G^2 = E[e^2(k,\ell)] = R_{xx}(0,0) + \sum_{i}\sum_{j} A_{ij}\,R_{xx}(i,j)$$

and the mean of the error $e(k,\ell)$ is $E[e(k,\ell)] = E[G\,u(k,\ell)] = 0$.

Again, with knowledge of the autocorrelation function of the two-dimensional stationary random field, the filter coefficients $A_{ij}$ can be obtained from Equation (2-59), and using these values of $A_{ij}$, the gain $G$ in Equation (2-52) can be calculated by Equation (2-60).

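Example 1  Consider a one-dimensional stationary band limited random process for which the autocorrelation function is given by

$$R_{xx}(m) = \rho^{|m|}\cos(\omega_0 m)$$

Find a model which describes this process. An all-pole model is chosen such that

$$x(n) = -\sum_{k=1}^{p} A_k\,x(n-k) + G\,u(n)$$

where $E[e(n)] = 0$ and $E[u(n)\,u(m)] = \delta_{mn}$.

One has to choose the order of the difference equation; here, a second order model ($p=2$) is chosen. Then

$$x(n) = -A_1 x(n-1) - A_2 x(n-2) + G\,u(n)$$

The problem reduces to that of finding $A_1$, $A_2$, $G$ with the given $R_{xx}(m)$.

From Equation (2-49),

$$R_{xx}(m) = -\sum_{k=1}^{2} A_k\,R_{xx}(m-k)$$

$$R_{xx}(1) = -A_1 R_{xx}(0) - A_2 R_{xx}(1)$$

$$R_{xx}(2) = -A_1 R_{xx}(1) - A_2 R_{xx}(0)$$

Putting this in matrix form,

$$\begin{bmatrix} -R_{xx}(0) & -R_{xx}(1) \\ -R_{xx}(1) & -R_{xx}(0) \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} R_{xx}(1) \\ R_{xx}(2) \end{bmatrix}$$

From $R_{xx}(m) = \rho^{|m|}\cos(\omega_0 m)$,

$$\begin{bmatrix} -1 & -\rho\cos\omega_0 \\ -\rho\cos\omega_0 & -1 \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} \rho\cos\omega_0 \\ \rho^2\cos 2\omega_0 \end{bmatrix}$$

Therefore

$$A_1 = -\frac{\rho\cos\omega_0\,(1-\rho^2\cos 2\omega_0)}{1-\rho^2\cos^2\omega_0}, \qquad A_2 = \frac{\rho^2\sin^2\omega_0}{1-\rho^2\cos^2\omega_0}$$

From Equation (2-51),

$$G^2 = R_{xx}(0) + \sum_{k=1}^{2} A_k\,R_{xx}(k) = 1 + A_1\rho\cos\omega_0 + A_2\rho^2\cos 2\omega_0$$

$$G^2 = 1 - \frac{\rho^2\cos^2\omega_0\,(1-\rho^2\cos 2\omega_0)}{1-\rho^2\cos^2\omega_0} + \frac{\rho^4\sin^2\omega_0\cos 2\omega_0}{1-\rho^2\cos^2\omega_0}$$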
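As a numerical cross-check of Example 1, the sketch below solves the second order normal equations directly and compares the result with the closed-form coefficients. It assumes the autocorrelation $R_{xx}(m)=\rho^{|m|}\cos(\omega_0 m)$; the function name `ar2_from_autocorrelation` and the sample values of `rho` and `w0` are illustrative choices, not part of the original text.

```python
import numpy as np

def ar2_from_autocorrelation(rho, w0):
    """Solve R(m) = -A1 R(m-1) - A2 R(m-2), m = 1, 2, for an assumed
    autocorrelation R(m) = rho**|m| * cos(w0 * m)."""
    R = lambda m: rho ** abs(m) * np.cos(w0 * m)
    # Coefficient matrix and right-hand side of the normal equations.
    M = -np.array([[R(0), R(1)],
                   [R(1), R(0)]])
    rhs = np.array([R(1), R(2)])
    A1, A2 = np.linalg.solve(M, rhs)
    # Modeling error variance G^2 = R(0) + A1 R(1) + A2 R(2).
    G2 = R(0) + A1 * R(1) + A2 * R(2)
    return A1, A2, G2

rho, w0 = 0.9, 0.6          # illustrative values only
A1, A2, G2 = ar2_from_autocorrelation(rho, w0)

# Closed-form expressions for comparison.
den = 1 - rho**2 * np.cos(w0)**2
A1_closed = -rho * np.cos(w0) * (1 - rho**2 * np.cos(2 * w0)) / den
A2_closed = rho**2 * np.sin(w0)**2 / den
```

Solving the linear system numerically rather than by hand makes it easy to try a higher model order by enlarging the matrix.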

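Example 2  Given a stationary two-dimensional random field with

$$R_{xx}(m,n) = \rho^{|m|}\rho^{|n|}$$

find the autoregressive model of this random field (most monochromatic images can be assumed to have this form of autocorrelation function). A first-order model ($M_1 = 1$, $M_2 = 1$) is chosen. Then Equation (2-52) can be written as follows:

$$x(k,\ell) = -A_{10}\,x(k-1,\ell) - A_{11}\,x(k-1,\ell-1) - A_{01}\,x(k,\ell-1) + G\,u(k,\ell)$$

where $u(k,\ell)$ is white noise with zero mean and unit variance.

Then, using Equation (2-59),

$$R_{xx}(m,n) = -\sum_{i=0}^{1}\sum_{j=0}^{1} A_{ij}\,R_{xx}(m-i,n-j), \quad (i,j)\neq(0,0), \quad \text{for all } (m,n)\in\Omega$$

$$R_{xx}(1,0) = -A_{10}R_{xx}(0,0) - A_{11}R_{xx}(0,-1) - A_{01}R_{xx}(1,-1)$$

$$R_{xx}(1,1) = -A_{10}R_{xx}(0,1) - A_{11}R_{xx}(0,0) - A_{01}R_{xx}(1,0)$$

$$R_{xx}(0,1) = -A_{10}R_{xx}(-1,1) - A_{11}R_{xx}(-1,0) - A_{01}R_{xx}(0,0)$$

Putting this in matrix form gives

$$\begin{bmatrix} -R_{xx}(0,0) & -R_{xx}(0,-1) & -R_{xx}(1,-1) \\ -R_{xx}(0,1) & -R_{xx}(0,0) & -R_{xx}(1,0) \\ -R_{xx}(-1,1) & -R_{xx}(-1,0) & -R_{xx}(0,0) \end{bmatrix} \begin{bmatrix} A_{10} \\ A_{11} \\ A_{01} \end{bmatrix} = \begin{bmatrix} R_{xx}(1,0) \\ R_{xx}(1,1) \\ R_{xx}(0,1) \end{bmatrix}$$

For the autocorrelation function given above,

$$\begin{bmatrix} -1 & -\rho & -\rho^2 \\ -\rho & -1 & -\rho \\ -\rho^2 & -\rho & -1 \end{bmatrix} \begin{bmatrix} A_{10} \\ A_{11} \\ A_{01} \end{bmatrix} = \begin{bmatrix} \rho \\ \rho^2 \\ \rho \end{bmatrix}$$

which yields

$$A_{01} = -\rho, \qquad A_{11} = \rho^2, \qquad A_{10} = -\rho$$

The modeling error variance, or the square of the gain $G$, can be calculated by Equation (2-60):

$$G^2 = R_{xx}(0,0) + \sum_{i=0}^{1}\sum_{j=0}^{1} A_{ij}\,R_{xx}(i,j), \quad (i,j)\neq(0,0)$$

$$G^2 = 1 + A_{01}R_{xx}(0,1) + A_{11}R_{xx}(1,1) + A_{10}R_{xx}(1,0) = (1-\rho^2)^2$$

Therefore, the complete model for $R_{xx}(m,n) = \rho^{|m|}\rho^{|n|}$ is

$$x(k,\ell) = \rho\,x(k,\ell-1) - \rho^2\,x(k-1,\ell-1) + \rho\,x(k-1,\ell) + (1-\rho^2)\,u(k,\ell)$$

where $E[u(k,\ell)] = 0$ and $E[u(k,\ell)\,u(k-p,\ell-q)] = \delta_{pq}$.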
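The first-order two-dimensional result of Example 2 can be checked numerically. The sketch below builds the three normal equations of (2-59) for the assumed separable autocorrelation $R_{xx}(m,n)=\rho^{|m|+|n|}$ and solves them; the function name `first_order_2d_model` and the ordering of the support points are illustrative assumptions.

```python
import numpy as np

def first_order_2d_model(rho):
    """Solve (2-59) for A10, A11, A01 and evaluate G^2 from (2-60),
    assuming R(m, n) = rho**(|m| + |n|)."""
    R = lambda m, n: rho ** (abs(m) + abs(n))
    # Support points (i, j), in the order A10, A11, A01.
    support = [(1, 0), (1, 1), (0, 1)]
    # One equation per (m, n) in the support: R(m,n) = -sum A_ij R(m-i, n-j).
    M = -np.array([[R(m - i, n - j) for (i, j) in support]
                   for (m, n) in support])
    rhs = np.array([R(m, n) for (m, n) in support])
    A10, A11, A01 = np.linalg.solve(M, rhs)
    G2 = R(0, 0) + A10 * R(1, 0) + A11 * R(1, 1) + A01 * R(0, 1)
    return A10, A11, A01, G2

A10, A11, A01, G2 = first_order_2d_model(0.5)
```

For $\rho = 0.5$ this should reproduce $A_{10}=A_{01}=-\rho$, $A_{11}=\rho^2$ and $G^2=(1-\rho^2)^2$.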

3. Method of Filter Response

Another method of modeling a linear stationary random process is based on the concept that a linear random process is the result of filtering white noise through a linear filter. In Section B of this chapter, the properties of linear systems have been discussed.

For the discrete system, it is known that the filter output power spectral density is the Z transformation of the output correlation function. That is,

$$G_{yy}(Z) = \mathcal{Z}[R_{yy}(m)]$$

$$G_{yy}(Z_1,Z_2) = \mathcal{Z}[R_{yy}(m,n)]$$

and it is also noted that

$$G_{yy}(Z) = G_{xx}(Z)\,H(Z)\,H(Z^{-1})$$

$$G_{yy}(Z_1,Z_2) = G_{xx}(Z_1,Z_2)\,H(Z_1,Z_2)\,H(Z_1^{-1},Z_2^{-1})$$

where $G_{xx}(Z)$, $G_{xx}(Z_1,Z_2)$ is the input power spectral density function and $H(Z)$, $H(Z_1,Z_2)$ is the transfer function of the filter. Denoting the white noise input as $u(n)$ or $u(k,\ell)$, and the output of the filter, which is the linear stationary random process we are going to model, as $x(n)$ or $x(k,\ell)$, the problem can be stated as follows: for a given autocorrelation function $R(m)$ (or $R(m,n)$) of a linear random process (field), find a linear system such that when the input is white noise, the output of the filter has the given autocorrelation function $R(m)$ (or $R(m,n)$).

[Block diagram: white noise $u(n)$ into the unknown filter $H(Z)$ gives $x(n)$; white noise $u(k,\ell)$ into the unknown filter $H(Z_1,Z_2)$ gives $x(k,\ell)$; given $R_{xx}(m)$, $R_{xx}(m,n)$.]

FIGURE 2-3  THE MODELING PROBLEM IN THE DISCRETE CASE

If the input is white noise, then

$$G_{uu}(Z) = \text{const}, \qquad G_{uu}(Z_1,Z_2) = \text{const}$$

Therefore, the solution for the required filter is to find a function $H(Z)$ or $H(Z_1,Z_2)$ which satisfies

$$G_{xx}(Z) = \text{const}\cdot H(Z)\,H(Z^{-1})$$

$$G_{xx}(Z_1,Z_2) = \text{const}\cdot H(Z_1,Z_2)\,H(Z_1^{-1},Z_2^{-1})$$  (2-62)

where $G_{xx}(Z)$, $G_{xx}(Z_1,Z_2)$ is known through the relation

$$G_{xx}(Z) = \mathcal{Z}[R_{xx}(m)], \qquad G_{xx}(Z_1,Z_2) = \mathcal{Z}[R_{xx}(m,n)]$$

since $R_{xx}$ is given.

But for the two-dimensional case, there is an inherent difficulty in the factorization of $G_{xx}(Z_1,Z_2)$ into $H(Z_1,Z_2)\,H(Z_1^{-1},Z_2^{-1})$. Therefore, only separable functions can be modeled by this technique.

Example 1  For a given stationary linear random process with

$$R_{xx}(m) = \rho^{|m|}\cos(\omega_0 m), \qquad m = 0, 1, 2, \ldots$$

find a difference equation which will give a random process with the autocorrelation above.

$$G_{xx}(Z) = \mathcal{Z}[R_{xx}(m)] = \frac{(1-\rho^2)\,[-\rho Z\cos\omega_0 + (1+\rho^2) - \rho Z^{-1}\cos\omega_0]}{(1-2\rho Z\cos\omega_0+\rho^2 Z^2)\,(1-2\rho Z^{-1}\cos\omega_0+\rho^2 Z^{-2})}$$  (2-51)

The second step is to find a factored expression for $G_{xx}(Z)$, i.e.

$$G_{xx}(Z) = G_1(Z)\,G_1(Z^{-1})$$

Assuming that $G_{xx}(Z)$ has the form

$$G_{xx}(Z) = (1-\rho^2)\,\frac{aZ^{-1}+b}{1-2\rho Z^{-1}\cos\omega_0+\rho^2 Z^{-2}}\cdot\frac{aZ+b}{1-2\rho Z\cos\omega_0+\rho^2 Z^2}$$  (2-52)

and comparing (2-52) and (2-51), $a$ and $b$ can be obtained as

$$a = \tfrac{1}{2}\left[(1-2\rho\cos\omega_0+\rho^2)^{1/2} + (1+2\rho\cos\omega_0+\rho^2)^{1/2}\right]$$

$$b = \tfrac{1}{2}\left[(1-2\rho\cos\omega_0+\rho^2)^{1/2} - (1+2\rho\cos\omega_0+\rho^2)^{1/2}\right]$$

From Equation (2-50), if we choose

$$H(Z) = \frac{aZ^{-1}+b}{1-2\rho Z^{-1}\cos\omega_0+\rho^2 Z^{-2}}$$

then the term $(1-\rho^2)$ in Equation (2-52) can be considered as $G_{uu}(Z)$. Therefore

$$R_{uu}(m) = \begin{cases} (1-\rho^2) & \text{if } m = 0 \\ 0 & \text{if } m \neq 0 \end{cases}$$

and the complete model is drawn in block diagram form in Figure 2-4.
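The shaping filter of this example can be checked without simulating noise: for white input of variance $(1-\rho^2)$, the output autocorrelation equals that variance times the deterministic autocorrelation of the impulse response of $H(Z)$. The sketch below is a hedged check under the assumption that $a$ and $b$ satisfy $a^2+b^2 = 1+\rho^2$ and $ab = -\rho\cos\omega_0$ (the closed forms used in the code follow from matching coefficients); the function name and parameter values are illustrative.

```python
import numpy as np

def shaping_filter_autocorrelation(rho, w0, max_lag=5, n_terms=2000):
    """Output autocorrelation of H(z) = (b + a z^-1) /
    (1 - 2 rho cos(w0) z^-1 + rho^2 z^-2) driven by white noise
    of variance (1 - rho^2)."""
    c = np.cos(w0)
    s_minus = np.sqrt(1 - 2 * rho * c + rho**2)
    s_plus = np.sqrt(1 + 2 * rho * c + rho**2)
    a = 0.5 * (s_minus + s_plus)   # assumed closed form for a
    b = 0.5 * (s_minus - s_plus)   # assumed closed form for b
    # Impulse response from the difference equation of H(z).
    h = np.zeros(n_terms)
    for n in range(n_terms):
        h[n] = (b if n == 0 else 0.0) + (a if n == 1 else 0.0)
        if n >= 1:
            h[n] += 2 * rho * c * h[n - 1]
        if n >= 2:
            h[n] -= rho**2 * h[n - 2]
    var_u = 1 - rho**2
    return np.array([var_u * np.dot(h[:n_terms - m], h[m:])
                     for m in range(max_lag + 1)])

rho, w0 = 0.8, 0.7          # illustrative values only
R_model = shaping_filter_autocorrelation(rho, w0)
R_target = np.array([rho**m * np.cos(w0 * m) for m in range(6)])
```

The impulse response is truncated at `n_terms` samples, which is harmless here because it decays like $\rho^n$.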


[Block diagram: white noise $u(n)$ with variance $(1-\rho^2)$ into $H(Z) = (aZ^{-1}+b)/(1-2\rho Z^{-1}\cos\omega_0+\rho^2 Z^{-2})$; the output $x(n)$ has the given autocorrelation.]

FIGURE 2-4  FILTER FOR ONE-DIMENSIONAL BAND PASS PROCESS

For putting the system's input-output relation into state equation form, it is defined for convenience as

$$U(Z) = Z^{-1}\,W(Z)$$

where $w(n)$ is also white noise with the same autocorrelation function

$$R_{ww}(m) = \begin{cases} (1-\rho^2) & \text{if } m = 0 \\ 0 & \text{if } m \neq 0 \end{cases}$$

Then the transfer function is

$$H(Z) = \frac{X(Z)}{U(Z)} = \frac{aZ^{-1}+b}{1-2\rho Z^{-1}\cos\omega_0+\rho^2 Z^{-2}}$$

Defining

$$X_1(Z) = -\rho^2 Z^{-1} X_2(Z)$$

$$X_2(Z) = \frac{U(Z)}{1-2\rho\cos\omega_0\,Z^{-1}+\rho^2 Z^{-2}}$$

$$X(Z) = (aZ^{-1}+b)\,X_2(Z)$$

then

$$x_1(n) = -\rho^2 x_2(n-1)$$

$$x_2(n) = x_1(n-1) + 2\rho\cos\omega_0\,x_2(n-1) + w(n-1)$$

and

$$x(n) = a\,x_2(n-1) + b\,x_2(n) = -\rho^{-2}a\,x_1(n) + b\,x_2(n)$$

Putting the state and output equations in matrix form gives

$$\begin{bmatrix} x_1(n) \\ x_2(n) \end{bmatrix} = \begin{bmatrix} 0 & -\rho^2 \\ 1 & 2\rho\cos\omega_0 \end{bmatrix} \begin{bmatrix} x_1(n-1) \\ x_2(n-1) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} w(n-1)$$

$$x(n) = \begin{bmatrix} -\rho^{-2}a & b \end{bmatrix} \begin{bmatrix} x_1(n) \\ x_2(n) \end{bmatrix}$$


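Example 2  The two-dimensional "band limited" discrete Markov process is defined by the autocorrelation function

$$R_{xx}(m,n) = [\rho_1^{|m|}\cos(\omega_1 m)]\,[\rho_2^{|n|}\cos(\omega_2 n)]$$

Then the discrete power spectral density is

$$G_{xx}(Z_1,Z_2) = (1-\rho_1^2)(1-\rho_2^2)\,\frac{A(Z_1,Z_2)}{B(Z_1,Z_2)}$$

where

$$A(Z_1,Z_2) = [-\rho_1 Z_1\cos\omega_1 + (1+\rho_1^2) - \rho_1 Z_1^{-1}\cos\omega_1]\,[-\rho_2 Z_2\cos\omega_2 + (1+\rho_2^2) - \rho_2 Z_2^{-1}\cos\omega_2]$$

$$B(Z_1,Z_2) = [(1-2\rho_1 Z_1\cos\omega_1+\rho_1^2 Z_1^2)(1-2\rho_1 Z_1^{-1}\cos\omega_1+\rho_1^2 Z_1^{-2})]\,[(1-2\rho_2 Z_2\cos\omega_2+\rho_2^2 Z_2^2)(1-2\rho_2 Z_2^{-1}\cos\omega_2+\rho_2^2 Z_2^{-2})]$$

Putting $A(Z_1,Z_2)$ in the following form

$$A(Z_1,Z_2) = [(a_1 Z_1^{-1}+b_1)(a_1 Z_1+b_1)]\,[(a_2 Z_2^{-1}+b_2)(a_2 Z_2+b_2)]$$

and comparing this equation with $A(Z_1,Z_2)$ above, one obtains $a_1$, $a_2$, $b_1$, $b_2$, for $i = 1, 2$:

$$a_i = \tfrac{1}{2}\left[(1-2\rho_i\cos\omega_i+\rho_i^2)^{1/2} + (1+2\rho_i\cos\omega_i+\rho_i^2)^{1/2}\right]$$

$$b_i = \tfrac{1}{2}\left[(1-2\rho_i\cos\omega_i+\rho_i^2)^{1/2} - (1+2\rho_i\cos\omega_i+\rho_i^2)^{1/2}\right]$$

Then from (2-50),

$$R_{uu}(m,n) = \begin{cases} (1-\rho_1^2)(1-\rho_2^2) & \text{if } n = 0,\ m = 0 \\ 0 & \text{otherwise} \end{cases}$$

[Block diagram: white noise $u(k,\ell)$ into $H(Z_1,Z_2)$; the output $x(k,\ell)$ has $R_{xx}(m,n) = (\rho_1^{|m|}\cos\omega_1 m)(\rho_2^{|n|}\cos\omega_2 n)$.]

FIGURE 2-5  FILTER TO GENERATE BAND PASS RANDOM FIELD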

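It is convenient to define

$$U(Z_1,Z_2) = Z_1^{-1}Z_2^{-1}\,W(Z_1,Z_2)$$

where $w(k,\ell)$ is also white noise with the same statistics as $u(k,\ell)$, so that $R_{ww}(m,n) = R_{uu}(m,n)$, which is nonzero only for $m = 0$ and $n = 0$.

Then, the filter has the form

$$\frac{X(Z_1,Z_2)}{W(Z_1,Z_2)} = \frac{(a_1 Z_1^{-1}+b_1)(a_2 Z_2^{-1}+b_2)\,Z_1^{-1}Z_2^{-1}}{(1-2\rho_1 Z_1^{-1}\cos\omega_1+\rho_1^2 Z_1^{-2})(1-2\rho_2 Z_2^{-1}\cos\omega_2+\rho_2^2 Z_2^{-2})}$$

The following definitions are made:

$$N_1(Z_1,Z_2) = -\rho_1^2 Z_1^{-1} N_2(Z_1,Z_2)$$

$$N_2(Z_1,Z_2) = \frac{Z_1^{-1}\,W(Z_1,Z_2)}{1-2\rho_1 Z_1^{-1}\cos\omega_1+\rho_1^2 Z_1^{-2}}$$

$$X_1(Z_1,Z_2) = (a_1 Z_1^{-1}+b_1)\,N_2(Z_1,Z_2)$$

From the last three definitions one can write the set of difference equations

$$\begin{bmatrix} N_1(k,\ell) \\ N_2(k,\ell) \end{bmatrix} = \begin{bmatrix} 0 & -\rho_1^2 \\ 1 & 2\rho_1\cos\omega_1 \end{bmatrix} \begin{bmatrix} N_1(k,\ell-1) \\ N_2(k,\ell-1) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} w(k,\ell-1)$$

$$X_1(k,\ell) = -\rho_1^{-2}a_1\,N_1(k,\ell) + b_1\,N_2(k,\ell)$$

Now, additional definitions are made:

$$M_1(Z_1,Z_2) = -\rho_2^2 Z_2^{-1} M_2(Z_1,Z_2)$$

$$M_2(Z_1,Z_2) = \frac{Z_2^{-1}\,X_1(Z_1,Z_2)}{1-2\rho_2 Z_2^{-1}\cos\omega_2+\rho_2^2 Z_2^{-2}}$$

From these definitions it follows that

$$X(Z_1,Z_2) = (a_2 Z_2^{-1}+b_2)\,M_2(Z_1,Z_2)$$  (2-55)

Then

$$\begin{bmatrix} M_1(k,\ell) \\ M_2(k,\ell) \end{bmatrix} = \begin{bmatrix} 0 & -\rho_2^2 \\ 1 & 2\rho_2\cos\omega_2 \end{bmatrix} \begin{bmatrix} M_1(k-1,\ell) \\ M_2(k-1,\ell) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} X_1(k-1,\ell)$$

$$x(k,\ell) = -\rho_2^{-2}a_2\,M_1(k,\ell) + b_2\,M_2(k,\ell)$$  (2-56)

Combining these all together, the following form is obtained:

$$\begin{bmatrix} M_1(k+1,\ell) \\ M_2(k+1,\ell) \\ N_1(k,\ell+1) \\ N_2(k,\ell+1) \end{bmatrix} = \begin{bmatrix} 0 & -\rho_2^2 & 0 & 0 \\ 1 & 2\rho_2\cos\omega_2 & -\rho_1^{-2}a_1 & b_1 \\ 0 & 0 & 0 & -\rho_1^2 \\ 0 & 0 & 1 & 2\rho_1\cos\omega_1 \end{bmatrix} \begin{bmatrix} M_1(k,\ell) \\ M_2(k,\ell) \\ N_1(k,\ell) \\ N_2(k,\ell) \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} w(k,\ell)$$

and

$$x(k,\ell) = \begin{bmatrix} -\rho_2^{-2}a_2 & b_2 & 0 & 0 \end{bmatrix} \begin{bmatrix} M_1(k,\ell) \\ M_2(k,\ell) \\ N_1(k,\ell) \\ N_2(k,\ell) \end{bmatrix}$$
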
Two methods of modeling have been discussed. Some of the above examples will be used in later chapters. It should be noted again that the filter response method cannot be used for the case where the autocorrelation function is nonseparable.

III.  ADAPTIVE  FILTERS


A.  NONRECURSIVE  FILTER 
1.  Introduction 

Many  forms  of  adaptive  filters  have  been  described  in 
the  literature,  some  of  which  have  been  shown  to  be  optimal 
(or  suboptimal)  in  certain  applications.  The  special  form 
of  an  adaptive  nonrecursive  filter  developed  by  Widrow  [1] 

is  reviewed  here  to  give  some  insights  to  the  recursive  adap- 
tive filter  developed  in  next  section. 

The  filter  to  be  considered  here  consists  of  a tapped 
delay  line,  variable  weights  whose  input  signals  are  the  signals 
at  the  delay  line  taps,  a summer  to  add  the  weighted  signals, 
and  machinery  to  adjust  the  weights  automatically.  The  impulse 
response  of  such  a discrete  system  is  completely  controlled 
by  the  weights.  The  adaptation  process  automatically  seeks  an 
optimal  filter  impulse  response  by  adjusting  the  weights. 

Two kinds of processes take place in the adaptive filter: training and operating. The training (adaptation) process is concerned with adjusting the weights, and the operating process consists of forming output signals by weighting the delay line tap signals, using the weights resulting from the training process. During the training process, an additional input signal, "the reference (or desired) response," must be supplied to the adaptive filter along with the usual input signals.

This requirement may in some cases restrict the use of this particular form of adaptive filter. An example illustrating the use of the desired signal is the case of modeling an unknown system by a discrete adaptive filter, as shown in Figure 3-1. Here a discrete input signal $x(n)$ is applied to an unknown system to be modeled. The discrete adaptive model is supplied with the same input $x(n)$. The output of the unknown system, $d(n)$, is compared with the output $y(n)$ of the adaptive system. This system can self-adapt to minimize the mean square error (throughout this thesis, the mean square error is chosen as the performance measure), where the error is defined as the difference between the output of the adaptive model and the desired signal (for this problem the desired signal is the output of the unknown system to be modeled).

Then

$$y(n) = \sum_{i=0}^{N} w_i\,x(n-i)$$  (3-1)

$$e(n) = y(n) - d(n)$$  (3-2)

Noting that Equation (3-1) is the convolution summation, the sequence of weights $\{w_i\}$ can be seen as the impulse response of the adaptive system.

The weights are adjusted to minimize $E(e^2)$.

It will be shown that if the input and output signals of the system being modeled are stationary, the error signal has a mean square value which is a quadratic function of the weight settings.

For the minimization of mean square error, the steepest descent method is used. Throughout this thesis, the terms "filter coefficient updating process" and "adaptation process" are used interchangeably, and it is assumed that the input to the adaptive system and the desired signal are stationary random processes (or random fields for the two-dimensional case).

2. Performance Surface, Gradient and the Wiener Solution

The input signals are weighted and summed to form an output signal [Equation (3-1)]. Introducing the vector notation such that

$$W^T = [w_0,\ w_1,\ w_2,\ \ldots,\ w_N]$$

$$X^T(n) = [x(n),\ x(n-1),\ x(n-2),\ \ldots,\ x(n-N)]$$  (3-3)

Equation (3-1), which describes the linear combination (operating process), can be written in matrix form:

$$y(n) = W^T X = X^T W$$  (3-4)

and

$$e(n) = y(n) - d(n) = W^T X - d(n)$$  (3-5)

The square of this error is

$$e^2(n) = W^T X X^T W - 2d(n)\,W^T X + d^2(n)$$  (3-6)

The mean square error, the expected value of $e^2(n)$, is

$$E[e^2(n)] = E[d^2(n)] - 2W^T R_{xd} + W^T R_{xx} W$$  (3-7)

where the vector of cross-correlations between the input signals and the desired response is defined as

$$R_{xd} \triangleq E[d(n)X(n)] = E\begin{bmatrix} d(n)x(n) \\ d(n)x(n-1) \\ \vdots \\ d(n)x(n-N) \end{bmatrix}$$  (3-8)

and where the correlation matrix of the input signal is defined as

$$R_{xx} \triangleq E[X(n)X^T(n)] = E\begin{bmatrix} x(n)x(n) & x(n)x(n-1) & \cdots & x(n)x(n-N) \\ x(n-1)x(n) & x(n-1)x(n-1) & \cdots & x(n-1)x(n-N) \\ \vdots & & & \vdots \\ x(n-N)x(n) & \cdots & & x(n-N)x(n-N) \end{bmatrix}$$  (3-9)

It may be observed from (3-7) that for stationary input signals, the mean-square error is precisely a second order function of the weights. The mean square error performance function may be visualized as a bowl shaped surface, a quadratic function of the weight variables. Then the adaptive process has the job of continually seeking the "bottom of the bowl." A means of accomplishing this by the well-known method of steepest descent is discussed below.

In the nonstationary case, the bottom of the bowl may be moving, and the orientation and curvature of the bowl may be changing. The adaptive process has to track the bottom of the bowl when inputs are nonstationary. The method of steepest descent uses the gradient of the performance surface in seeking its minimum. The gradient at any point on the performance surface may be obtained by differentiating the mean-square error function of Equation (3-7) with respect to the weight vector. The gradient is

$$\nabla\{E[e^2(n)]\} = -2R_{xd} + 2R_{xx}W$$  (3-10)

To find the "optimal" weight vector $W_{LMS}$, i.e. the one that yields the least mean square error, set the gradient to zero. Accordingly,

$$R_{xd} = R_{xx}W_{LMS}, \qquad W_{LMS} = R_{xx}^{-1}R_{xd}$$  (3-11)

Equation (3-11) is known as the Wiener-Hopf equation in matrix form.

Then the minimum mean square error may be obtained by substituting (3-11) into (3-7):

$$E[e^2(n)]_{\min} = E[d^2(n)] - W_{LMS}^T R_{xd}$$  (3-12)
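The Wiener solution and the minimum mean square error can be illustrated on a small numerical case. The sketch below is only an illustration: the correlation values are invented (a two-weight filter whose desired signal is a known linear combination of the inputs plus independent noise of variance 0.1), and the function names `wiener_solution` and `mse` are hypothetical.

```python
import numpy as np

def wiener_solution(R_xx, R_xd, d_power):
    """Solve R_xx W = R_xd and return (W, minimum mean square error)."""
    W = np.linalg.solve(R_xx, R_xd)
    mse_min = d_power - W @ R_xd
    return W, mse_min

def mse(W, R_xx, R_xd, d_power):
    """Quadratic performance surface E[d^2] - 2 W^T R_xd + W^T R_xx W."""
    return d_power - 2 * W @ R_xd + W @ R_xx @ W

# Illustrative statistics: d(n) = W0^T X(n) + noise, noise variance 0.1.
R_xx = np.array([[1.0, 0.5],
                 [0.5, 1.0]])
W0 = np.array([1.0, -0.5])
R_xd = R_xx @ W0
d_power = W0 @ R_xx @ W0 + 0.1

W_opt, mse_min = wiener_solution(R_xx, R_xd, d_power)
gradient_at_opt = -2 * R_xd + 2 * R_xx @ W_opt
```

Solving the linear system with `np.linalg.solve` avoids forming the matrix inverse explicitly, which is usually preferable numerically.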

3. LMS Algorithm

In seeking the minimum mean-square error by the method of steepest descent, one begins with an initial guess as to where the minimum point of the mean-square error surface may be. This means that one begins with a set of initial conditions for the weights. The gradient vector is then measured, and the next guess is obtained from the present guess by making a change in the weight vector in the direction of the negative of the gradient vector. The method of steepest descent can thus be described by the following relation, with $k$ a negative constant:

$$W(n+1) = W(n) + k\,\nabla[E(e^2(n))]$$  (3-13)

The expression for $\nabla[E(e^2(n))]$ is obtained by using Equation (3-10):

$$W(n+1) = W(n) + 2kR_{xx}W(n) - 2kR_{xd}$$  (3-14)

The gradient vector $\nabla[E(e^2(n))]$ is the gradient of the expectation of the squared error function when the weight vector is $W(n)$.

When the performance function is quadratic, the gradient is a linear function of the weights. The advantage of working with the quadratic performance surface lies both in this linear relation and in the fact that such a surface has a unique minimum.




The purpose of the adaptation process is to find an exact or an approximate solution to the Wiener-Hopf equation (3-11).

One way of finding the optimum weight vector is simply to solve (3-11). Although this solution is generally straightforward, it could present serious computational problems when the number of weights $N$ is large and when input data rates are high.

In addition to the necessity of inverting an $N \times N$ matrix, this method may require as many as $N(N+1)/2$ autocorrelation and cross correlation measurements to be made to obtain the elements of $R_{xx}$, $R_{xd}$.

No perfect solution of Equation (3-11) is possible in practice, since the elements of the correlation matrices cannot be estimated perfectly.

A method for finding approximate solutions to (3-11) is presented below. The accuracy of this method is limited by statistical sample size, since weight values are found that are based on finite-time measurements of input-data signals.

This method does not require explicit measurements of correlation functions, nor does it require matrix inversion.

It is the "LMS" algorithm based on the steepest descent method. This algorithm does not even require squaring, averaging, or differentiation in order to make use of gradients of mean-square error functions.

When using the LMS algorithm, changes in the weight vector are made along the direction of the estimated gradient vector. Accordingly

$$W(n+1) = W(n) + k\,\hat{\nabla}[E(e^2(n))]$$  (3-13)

where

$W(n)$ = weight vector before adaptation
$W(n+1)$ = weight vector after adaptation
$k$ = scalar constant controlling rate of convergence and stability ($k<0$)
$\hat{\nabla}[E(e^2(n))]$ = estimate of the gradient of $E[e^2(n)]$ with respect to $W$, with $W = W(n)$

One method for obtaining the estimated gradient of the mean square error function is to take the gradient of a single time sample of the squared error; that is

$$\hat{\nabla}[E(e^2(n))] = \nabla[e^2(n)] = 2e(n)\,\nabla[e(n)]$$  (3-14)

From Equation (3-4),

$$\nabla[e(n)] = \nabla[y(n)-d(n)] = \nabla[W^T(n)X(n)-d(n)] = X(n)$$  (3-15)

Thus,

$$\hat{\nabla}[E(e^2(n))] = 2e(n)X(n) = 2[W^T(n)X(n)-d(n)]\,X(n)$$  (3-16)

The gradient estimate of (3-16) is unbiased, as will be shown by the following argument. For a given weight vector $W(n)$, the expected value of the gradient estimate is

$$E\{\hat{\nabla}[E(e^2(n))]\} = 2E\{[W^T(n)X(n)-d(n)]\,X(n)\} = 2R_{xx}W - 2R_{xd}$$  (3-17)

Comparing (3-17) and (3-10), we see that

$$E\{\hat{\nabla}[E(e^2(n))]\} = \nabla E[e^2(n)]$$  (3-18)

and therefore, for a given weight vector, the gradient estimate $\hat{\nabla}[E(e^2(n))]$ is unbiased.

Using the gradient estimation formula (3-16), the weight iteration rule of Equation (3-13) becomes

$$W(n+1) = W(n) + 2k\,e(n)\,X(n)$$  (3-19)

and the next weight vector is obtained by adding to the present weight vector the input vector scaled by the value of the error and by $2k$. This is the LMS algorithm. Looking at Equation (3-19), the adaptation process is a simple first order recursion equation which can be realized as shown below.

At this point, the basic concept of the LMS nonrecursive adaptive filter has been introduced and reviewed. More details can be seen in [1]. Widrow [1] showed that the weight vector mean converges to the Wiener solution and that the step size $k$ should be bounded such that

$$-\frac{1}{\lambda_{\max}} < k < 0$$

for stability and convergence, where $\lambda_{\max}$ is the maximum eigenvalue of $R_{xx}$.
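The LMS recursion of Equation (3-19) is short enough to sketch directly. The example below identifies an unknown three-tap system from noise-free data; the function name `lms_identify`, the tap values, the seed, and the step size are illustrative assumptions. Note that $k$ is negative and the error is $e(n) = y(n) - d(n)$, following the sign convention of the text.

```python
import numpy as np

def lms_identify(x, d, n_taps, k):
    """LMS update W(n+1) = W(n) + 2 k e(n) X(n), with k < 0 and
    e(n) = W^T X(n) - d(n)."""
    W = np.zeros(n_taps)
    for n in range(n_taps - 1, len(x)):
        # Tap vector X(n) = [x(n), x(n-1), ..., x(n-n_taps+1)].
        X = x[n - n_taps + 1:n + 1][::-1]
        e = W @ X - d[n]
        W = W + 2 * k * e * X
    return W

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)          # white, unit-variance input
h_true = np.array([0.5, -0.3, 0.2])    # hypothetical unknown system
d = np.convolve(x, h_true)[:len(x)]    # desired signal: system output
W = lms_identify(x, d, n_taps=3, k=-0.01)
```

For unit-variance white input the eigenvalues of $R_{xx}$ are all one, so $k = -0.01$ lies well inside the stated stability region; with noise-free data the weights should approach `h_true` closely.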

4. Two-dimensional adaptive filter
a. Adaptive filter structure

The input-output relation of the two-dimensional filter is given by two-dimensional convolution:

$$y(k,\ell) = \sum_{i=0}^{p}\sum_{j=0}^{q} w_{ij}\,x(k-i,\ell-j)$$  (3-20)

where $y(k,\ell)$ is the filter output and $w_{ij}$ is the finite impulse response of the filter.

Here, it is assumed that the input $x(k,\ell)$ is a stationary random field.

In Equation (3-20), a set of two-dimensional stationary input signals is weighted and summed to form an output signal, and the filter output is intended to match a desired (reference) signal in accordance with the minimization of mean squared error, where the error is the difference between filter output and desired signal:

$$e(k,\ell) = y(k,\ell) - d(k,\ell)$$  (3-21)




Introducing the vector notation such that

$$W^T = [w_{00},\ w_{01},\ \ldots,\ w_{0q},\ w_{10},\ w_{11},\ \ldots,\ w_{1q},\ w_{20},\ \ldots,\ w_{pq}]$$

and

$$X^T = [x(k,\ell),\ x(k,\ell-1),\ \ldots,\ x(k,\ell-q),\ x(k-1,\ell),\ x(k-1,\ell-1),\ \ldots,\ x(k-1,\ell-q),\ x(k-2,\ell),\ \ldots,\ x(k-p,\ell-q)]$$  (3-22)

then Equation (3-20) can be written in matrix form:

$$y(k,\ell) = W^T X = X^T W$$  (3-23)

where $W$ is a weight vector of dimension $(p+1)(q+1)\times 1$ and $X$ is an input signal vector of dimension $(p+1)(q+1)\times 1$.

The weight vector of the filter is adjusted in the direction such that the performance criterion (mean square error) is minimized. Thus, the linear combinatorial system in Equation (3-20) will be given with variable weights.

In Figure 3-3, the linear combinatorial structure is given. Then, the adaptive nonrecursive filter can be drawn as follows:

FIGURE 3-4  STRUCTURE OF NONRECURSIVE ADAPTIVE FILTER

b. Wiener solution

From Equations (3-22) and (3-23), the error signal can be written as

$$e(k,\ell) = W^T X - d(k,\ell)$$  (3-24)

The square of this error is

$$e^2(k,\ell) = W^T X X^T W - 2d(k,\ell)\,W^T X + d^2(k,\ell)$$  (3-25)

The expected value of $e^2(k,\ell)$ is

$$E[e^2(k,\ell)] = E[d^2(k,\ell)] - 2R_{xd}^T W + W^T R_{xx} W$$  (3-26)

where the vector of cross correlations between the input signal and desired response is defined as

$$R_{xd} \triangleq E[d(k,\ell)X] = E\begin{bmatrix} d(k,\ell)\,x(k,\ell) \\ d(k,\ell)\,x(k,\ell-1) \\ \vdots \\ d(k,\ell)\,x(k,\ell-q) \\ d(k,\ell)\,x(k-1,\ell) \\ \vdots \\ d(k,\ell)\,x(k-1,\ell-q) \\ \vdots \\ d(k,\ell)\,x(k-p,\ell-q) \end{bmatrix}$$  (3-28)

and where the correlation matrix of the input signals is defined as

$$R_{xx} \triangleq E[XX^T] = E\begin{bmatrix} X_1X_1 & X_1X_2 & X_1X_3 & \cdots & X_1X_{(p+1)(q+1)} \\ X_2X_1 & X_2X_2 & X_2X_3 & \cdots & X_2X_{(p+1)(q+1)} \\ X_3X_1 & X_3X_2 & X_3X_3 & \cdots & X_3X_{(p+1)(q+1)} \\ \vdots & & & & \vdots \end{bmatrix}$$  (3-29)

with $X_1, X_2, \ldots$ denoting the elements of $X$.

The gradient at any point on the performance surface can be obtained by differentiating the mean square error function of Equation (3-26):

$$\nabla[E(e^2(k,\ell))] = \frac{\partial E[e^2(k,\ell)]}{\partial W} = -2R_{xd} + 2R_{xx}W$$

To find the "optimal" weight vector $W_{LMS}$ that yields the least mean square error, set the gradient to zero. Accordingly,

$$R_{xd} = R_{xx}W_{LMS}, \qquad W_{LMS} = R_{xx}^{-1}R_{xd}$$  (3-30)

Equation (3-30) is the Wiener-Hopf equation in matrix form. Again, the minimum mean square error is obtained by substituting (3-30) into (3-26):

$$E[e^2(k,\ell)]_{\min} = E[d^2(k,\ell)] - W_{LMS}^T R_{xd}$$  (3-31)

c. LMS Algorithm

Consider a two-dimensional field $x(k,\ell)$ to be processed (usually two-dimensional filters are used in processing discrete two-dimensional image fields) and assume that the two-dimensional field consists of $N \times N$ discrete points (which may be a signal sensed by the $N \times N$ pixel elements of a sensor). The adaptation process is that of adjusting the filter coefficients in accordance with the minimization of the mean square error, as in the one-dimensional case. The adaptation scheme may be predetermined as being column-wise scanning, diagonal scanning, or row-wise scanning.

Here the row scanning process is adopted, as shown in Figure 3-5.

Therefore, $N \times N$ adaptation processes will be required to complete the filtering of an $N \times N$ two-dimensional field. Using $e(k,\ell)$ to denote the error at the $j$th iteration (then $e(k+1,\ell)$ is the $(j+1)$th error), the filter coefficient updating process can be described by

$$W(j+1) = W(j) + \mu\,\hat{\nabla}[E(e^2(k,\ell))]$$

where

$W(j+1)$ = coefficient vector after adaptation
$W(j)$ = coefficient vector before adaptation
$\mu$ = negative scalar constant controlling rate of convergence and stability.

The gradient of the mean square error is to be estimated by

$$\hat{\nabla}[E(e^2(k,\ell))] = \nabla[e^2(k,\ell)]$$

where $e(k,\ell) = y(k,\ell) - d(k,\ell)$. Then

$$W(j+1) = W(j) + 2\mu\,e(k,\ell)\,\nabla[e(k,\ell)] = W(j) + 2\mu\,e(k,\ell)\,X$$

where $X$ is defined by Equation (3-22), which agrees in sign with the one-dimensional rule (3-19) since $\mu$ is negative.

Along with $y(k,\ell) = W^T X$ and $e(k,\ell) = y(k,\ell) - d(k,\ell)$, the LMS algorithm is complete.
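A minimal sketch of the two-dimensional LMS update is given below. It assumes a raster scan in which the column index runs fastest, which is one possible reading of the scanning scheme, and it keeps the weights as a $(p+1)\times(q+1)$ array rather than a stacked vector. The name `lms_2d`, the field size, the seed, the hypothetical weights `w_true`, and the step size are illustrative choices only.

```python
import numpy as np

def lms_2d(x, d, p, q, mu):
    """Two-dimensional LMS: W <- W + 2 mu e(k,l) X, with mu < 0 and
    e(k,l) = y(k,l) - d(k,l)."""
    W = np.zeros((p + 1, q + 1))
    rows, cols = x.shape
    for k in range(p, rows):
        for l in range(q, cols):
            # X[i, j] = x(k - i, l - j) for 0 <= i <= p, 0 <= j <= q.
            X = x[k - p:k + 1, l - q:l + 1][::-1, ::-1]
            e = np.sum(W * X) - d[k, l]
            W = W + 2 * mu * e * X
    return W

rng = np.random.default_rng(1)
p, q = 1, 1
x = rng.standard_normal((64, 64))               # white random field
w_true = np.array([[0.6, -0.2],
                   [0.3, 0.1]])                 # hypothetical weights
rows, cols = x.shape
d = np.zeros_like(x)
for i in range(p + 1):
    for j in range(q + 1):
        # d(k,l) = sum_ij w_true[i,j] x(k-i, l-j) on the valid region.
        d[p:, q:] += w_true[i, j] * x[p - i:rows - i, q - j:cols - j]
W = lms_2d(x, d, p, q, mu=-0.02)
```

With noise-free data the estimated array `W` should approach `w_true`; with a noisy desired field it would instead fluctuate around the Wiener solution.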

B.  RECURSIVE FILTER
1.  Introduction

In the previous section, it was shown that adaptive nonrecursive filters have a finite impulse response; that is, they can produce only zeros, with no poles, in the filter transfer function. This limits the capability of transversal adaptive filters in many applications. To overcome this limitation, a new adaptive filter structure is described which is capable of producing poles in the transfer function. The basic configuration considered here is quite standard; that is, the present output sample of the filter $y(n)$ is a linear combination of the present and past samples of the input, $x(n), x(n-1), \ldots, x(n-M)$, and the past samples of the output, $y(n-1), y(n-2), \ldots, y(n-N)$. The present output sample of the filter is compared against a reference sample. The resulting error samples are used to adjust the filter parameters (feed forward gains and feedback gains) to minimize some error function. The one-dimensional recursive filter is developed first, then it is extended to the two-dimensional adaptive filter.

Recently Feintuck [2] and White [3] have proposed a technique for making digital filters with zeros and poles adaptive. This development may enhance the possibility of obtaining accurate models for unknown systems. The new approach is developed into an algorithm. It employs the steepest-descent criterion for parameter adjustment, but it differs from Feintuck [2] and Widrow [1] in the estimation of the mean squared error gradient vector.

2. One-Dimensional Adaptive Recursive Filter
a. Structure

The recursive filter is described by its transfer function

$$H(Z) = \frac{Y(Z)}{X(Z)} = \frac{\sum_{i=0}^{M} b_i Z^{-i}}{1 + \sum_{i=1}^{N} a_i Z^{-i}}$$  (3-32)

In the time domain, the input-output relation of the digital filter is given by

$$y(n) = \sum_{i=0}^{M} b_i\,x(n-i) - \sum_{i=1}^{N} a_i\,y(n-i)$$  (3-33)

where
$y(n)$ = $n$th sample of the filter output
$x(n)$ = $n$th sample of the filter input
$a_i$ = feedback coefficients, $i = 1, 2, \ldots, N$
$b_i$ = feed forward coefficients, $i = 0, 1, 2, \ldots, M$

The output samples of the filter are intended to match those of a reference (or desired) signal $d(n)$. In accordance with the minimization of some error criterion, the filter parameters $a_i$, $b_i$ are adjusted at every iteration. The general scheme of the adaptive recursive structure is given in Figure 3-6. Two finite length transversal filters are used, in the forward path and the feedback path, to form the recursive filter of Equation (3-33).

FIGURE 3-6  ADAPTIVE RECURSIVE LMS FILTER USING TWO TRANSVERSAL ADAPTIVE FILTERS

b.  Problem  in  Wiener  Solution 

Introducing  vector  notation  for  the  signals  and  sets 
of  filter  coefficients  we  have 
= [ aj^,  a2»  . . • • aN] 

bj^,  . . . . bj^] 

XCn)*^  = [ x(n),  x(n-l)  . ...  x(n-M)] 

YCn)"^  = [ y(n-l),  y(n-2)  . . . y(n-N)] 
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Then 


Equation  (3-33)  can  be  written  as 
y(n)  = B^X(n)  7(n)  (3-34) 

where  A is  the  feedback  coefficient  vector  (nxl) 

Bis  the  feed  forward  coefficient  vector  [ (M+l)xl] 
X(n)isliEi^t  signal  vector  at  nth  iteration  [ (M+lxl] 
Y(n)isthe output  signal  vector  at  nth  iteration  (Nxl) 

The  performance  criterion  is  again  minimum  m^ ~n  squared  error, 
where  tie  aroristhe  difference  between  filter  output  and  desired 
signal  (reference  signal) . 

That is, the filter is used to estimate a desired waveform d(n) in a minimum mean square error sense. Assume that the observables are stationary and zero mean, and let e(n) denote the error waveform at the nth sample; then

$$e(n) \triangleq y(n) - d(n) = B^T X(n) - A^T Y(n) - d(n) \tag{3-35}$$

and the mean square error is

$$
\begin{aligned}
E[e^2(n)] &= E[(B^T X(n) - A^T Y(n) - d(n))^2] \\
&= E[B^T X(n) X(n)^T B - 2 B^T X(n) Y(n)^T A + A^T Y(n) Y(n)^T A \\
&\qquad - 2 B^T d(n) X(n) + 2 A^T d(n) Y(n) + d^2(n)] \\
&= E[d^2(n)] + B^T R_{xx} B + A^T R_{yy} A - 2 B^T R_{xy} A - 2 B^T R_{dx} + 2 A^T R_{dy}
\end{aligned}
\tag{3-36}
$$

where

$$R_{xx} = E[X(n) X^T(n)]$$

$$R_{yy} = E[Y(n) Y^T(n)]$$

$$R_{dx} = E[d(n) X(n)]$$

$$R_{dy} = E[d(n) Y(n)]$$

and

$$R_{xy} = E[X(n) Y^T(n)]$$

The theory of Wiener filtering employs known second-order input statistics to dictate the impulse response of the linear filter that minimizes the mean square error; that is, as in the previous section, knowledge of the second-order statistics is assumed in order to calculate the optimum impulse response (the optimum weight vector $W_{LMS}$ in the nonrecursive adaptive filter). But here, in the recursive algorithm, it is also required that the autocorrelation of the output, $R_{yy}$, the cross correlation of the output and the input, $R_{xy}$, and the cross correlation of the output with the desired waveform, $R_{dy}$, be assumed known.

Thus, the set of statistics mentioned above is assumed to be known for the moment, and will be used to determine the weights in the recursive filter. The statistics for the fixed-parameter filter are not a function of the weights; instead, the weights are a function of these statistics. Therefore, $R_{xy}$, $R_{dy}$ and $R_{yy}$ are to be considered constant when the differentiation is made with respect to A and B.

The set of weights (filter coefficient vectors) that minimizes the mean square error can be found by setting the gradient vectors with respect to the filter parameters equal to zero:

$$\frac{\partial E[e^2(n)]}{\partial A} = 2 R_{yy} A - 2 R_{xy}^T B + 2 R_{dy} = 0, \qquad A = R_{yy}^{-1}\,(R_{xy}^T B - R_{dy}) \tag{3-37}$$

and

$$\frac{\partial E[e^2(n)]}{\partial B} = 2 R_{xx} B - 2 R_{xy} A - 2 R_{dx} = 0 \tag{3-38}$$

Thus, one can solve for the filter coefficients if all the second-order statistics are known. But without knowing the impulse response of the filter, $R_{xy}$, $R_{dy}$ and $R_{yy}$ cannot be calculated from the input and reference signal statistics alone. Since the impulse response being sought is the very one that minimizes the mean square error, it is clear that $R_{xy}$, $R_{dy}$ and $R_{yy}$ are not available, and so the Wiener approach is not feasible.

c. LMS Algorithm

An iterative gradient search technique (the method of steepest descent) is revisited. Here, in the recursive algorithm, it updates the filter coefficients with steps proportional to the gradient vector. This updating process is

$$
\begin{aligned}
A(n+1) &= A(n) + k_a \Delta A = A(n) + k_a \nabla_A [E(e^2(n))] \\
B(n+1) &= B(n) + k_b \Delta B = B(n) + k_b \nabla_B [E(e^2(n))]
\end{aligned}
\tag{3-39}
$$

where

A(n), B(n) = filter coefficient vectors before adaptation
A(n+1), B(n+1) = filter coefficient vectors after adaptation
k_a, k_b = scalar constants controlling rate of convergence and stability (k_a, k_b < 0)
$\nabla_A [E(e^2(n))]$, $\nabla_B [E(e^2(n))]$ = gradient vectors with respect to A and B respectively.

The updating process (3-39) can be considered as a first-order filtering process with an input proportional to the gradient vector.

But the gradients $\nabla_A [E(e^2(n))]$ and $\nabla_B [E(e^2(n))]$ must be estimated, because the output statistics are not available a priori and an infinite statistical sample would be required to estimate perfectly the elements of the correlation matrices in Equations (3-37) and (3-38). A method of estimating these gradients will be presented.

Widrow [1] obtained the estimated gradient of the mean square error function by taking the gradient of a single time sample of the squared error (an instantaneous estimate) when he discussed the nonrecursive adaptive filter (see previous section).

Here, in this thesis work, a new method of estimating the gradients is proposed. This is to approximate the mean squared error $E[e^2(n)]$ by an average over a finite number of points at every iteration, and to take the gradient of this average instead of the gradient of the instantaneous squared error. That is, the approximation used is

$$E[e^2(n)] \approx \frac{1}{L} \sum_{\ell=0}^{L-1} e^2(n-\ell) \tag{3-40}$$

For $E[e^2(n)]$, the average of the squared error over the previous L points is taken, and the gradient is then evaluated for this approximate mean square error. The estimated gradient of the mean square error is

$$
\begin{aligned}
\hat{\nabla}_A E[e^2(n)] &= \nabla_A \Big[ \sum_{\ell=0}^{L-1} e^2(n-\ell) \Big] \\
\hat{\nabla}_B E[e^2(n)] &= \nabla_B \Big[ \sum_{\ell=0}^{L-1} e^2(n-\ell) \Big]
\end{aligned}
\tag{3-41}
$$

For convenience, the (1/L) term in (3-40) was dropped.

Introducing the vector notation for the error signal

$$\bar{e}(n) = [\, e(n),\; e(n-1),\; \ldots,\; e(n-L+1) \,]^T$$

it is seen that the error signal vector is an (L x 1) vector. The estimated gradient (Equation (3-41)) can be put into the matrix form

$$\hat{\nabla}_A E[e^2(n)] = \nabla_A [\bar{e}^T(n)\, \bar{e}(n)]$$

$$\hat{\nabla}_B E[e^2(n)] = \nabla_B [\bar{e}^T(n)\, \bar{e}(n)]$$

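Substituting the estimated gradients in Equation (3-39), the updating process for the filter coefficients is

$$
\begin{aligned}
A(n+1) &= A(n) + k_a \nabla_A [\bar{e}^T(n)\, \bar{e}(n)] \\
B(n+1) &= B(n) + k_b \nabla_B [\bar{e}^T(n)\, \bar{e}(n)]
\end{aligned}
\tag{3-42}
$$

The function $\bar{e}^T(n)\,\bar{e}(n)$ is a scalar function of the coefficient vectors A and B, that is,

$$\bar{e}^T(n)\, \bar{e}(n) = f(A, B)$$

Therefore, by definition, the gradient of f(A, B) with respect to A and B is

$$\nabla_A f = \Big[ \frac{\partial f}{\partial a_1}, \ldots, \frac{\partial f}{\partial a_N} \Big]^T, \qquad \nabla_B f = \Big[ \frac{\partial f}{\partial b_0}, \ldots, \frac{\partial f}{\partial b_M} \Big]^T$$

It follows that

$$\hat{\nabla}_A [E(e^2(n))] \triangleq \nabla_A (\bar{e}(n)^T \bar{e}(n)) = 2 \Big[ \bar{e}(n)^T \frac{\partial \bar{e}(n)}{\partial a_1}, \ldots, \bar{e}(n)^T \frac{\partial \bar{e}(n)}{\partial a_N} \Big]^T \tag{3-43}$$

and

$$\hat{\nabla}_B [E(e^2(n))] \triangleq \nabla_B (\bar{e}(n)^T \bar{e}(n)) = 2 \Big[ \bar{e}(n)^T \frac{\partial \bar{e}(n)}{\partial b_0}, \ldots, \bar{e}(n)^T \frac{\partial \bar{e}(n)}{\partial b_M} \Big]^T \tag{3-44}$$

Consider the terms $\partial \bar{e}(n) / \partial a_p$, $\partial \bar{e}(n) / \partial b_q$ in Equations (3-43), (3-44), where p = 1, 2, ..., N and q = 0, 1, 2, ..., M.

Since e(n) and $\bar{e}(n)$ are defined as

$$e(n) \triangleq y(n) - d(n)$$

$$\bar{e}(n)^T = (e(n),\; e(n-1),\; \ldots,\; e(n-L+1))$$

and d(n) does not depend on the filter coefficients,

$$\frac{\partial \bar{e}(n)}{\partial a_p} = \Big[ \frac{\partial y(n)}{\partial a_p}, \frac{\partial y(n-1)}{\partial a_p}, \ldots, \frac{\partial y(n-L+1)}{\partial a_p} \Big]^T \tag{3-45}$$

$$\frac{\partial \bar{e}(n)}{\partial b_q} = \Big[ \frac{\partial y(n)}{\partial b_q}, \frac{\partial y(n-1)}{\partial b_q}, \ldots, \frac{\partial y(n-L+1)}{\partial b_q} \Big]^T \tag{3-46}$$

and $\bar{e}(n)^T$ is a (1 x L) vector.

Therefore, $\hat{\nabla}_A [E(e^2(n))]$ and $\hat{\nabla}_B [E(e^2(n))]$ in Equation (3-41) are (N x 1) and [(M+1) x 1] vectors respectively.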

Equations (3-45) and (3-46) may be considered as sensitivity vectors, which tell how much a change in a_p and b_q affects the outputs y(n), y(n-1), ..., y(n-L+1).

To calculate the elements of the estimated gradients of the mean square error in Equation (3-41), we should first calculate the sensitivity vectors of Equations (3-45) and (3-46).

From the recursive equation (3-33),

$$\frac{\partial y(n)}{\partial a_p} = \frac{\partial}{\partial a_p} \Big\{ \sum_{i=0}^{M} b_i x(n-i) - \sum_{i=1}^{N} a_i y(n-i) \Big\} = -y(n-p) - \sum_{i=1}^{N} a_i \frac{\partial y(n-i)}{\partial a_p}, \qquad p = 1, 2, \ldots, N \tag{3-47}$$

$$\frac{\partial y(n)}{\partial b_q} = x(n-q) - \sum_{i=1}^{N} a_i \frac{\partial y(n-i)}{\partial b_q}, \qquad q = 0, 1, 2, \ldots, M \tag{3-48}$$

The sensitivity vector components given by Equations (3-47) and (3-48) can be interpreted as being the response of a linear system with transfer function

$$H(z) = \frac{1}{1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_N z^{-N}} \tag{3-49}$$

Henceforth, this will be called the "sensitivity filter." Equation (3-49) is an all-pole filter (recursive filter) with input signals [-y(n-p)] and [x(n-q)] respectively.

Now, what are the initial conditions characterizing the recursive relationships of the sensitivity filter?

x(n-q) is the input x(n) to the adaptive recursive filter of Equation (3-33) delayed by q time units, and y(n-p) is the output y(n) of the recursive filter delayed by p time units.

From x(n) = 0, y(n) = 0 for n < 0,

x(n-q) and y(n-p) are sequences whose first q elements and first p elements, respectively, are zero. And since changes in the a_p and b_q coefficients have no effect on the system's response until n = p and n = q respectively, it follows that the initial conditions are:

$$\frac{\partial y(n)}{\partial a_p} = 0 \quad \text{for } n = 0, 1, \ldots, p-1$$

$$\frac{\partial y(n)}{\partial b_q} = 0 \quad \text{for } n = 0, 1, \ldots, q-1$$

A summary of this algorithm is the following:

1. Calculate the sensitivity vector components through the sensitivity recursive filter by Equations (3-47) and (3-48).

2. Calculate the estimated gradient by Equations (3-43) and (3-44).

3. Calculate the filter coefficient vector by Equation (3-42).

4. Calculate the filter output by Equation (3-34).

5. Form the error vector $\bar{e}(n)$, then go back to the 1st step.
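The five steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the original implementation: the function name `adaptive_recursive_filter`, the array names `alpha` and `beta` for the sensitivities of (3-47) and (3-48), the zero initial coefficients, and the system-identification test signal are all hypothetical. The loop performs the same cycle as the summary but starts at the output computation (step 4), so each pass runs steps 4, 5, 1, 2, 3.

```python
import numpy as np

def adaptive_recursive_filter(x, d, M, N, L, k_a, k_b):
    """Adaptive recursive filter with an L-point squared-error window.

    k_a, k_b are negative step constants, as in (3-39).
    A holds a_1..a_N, B holds b_0..b_M (hypothetical layout).
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    T = len(x)
    A = np.zeros(N)
    B = np.zeros(M + 1)
    y = np.zeros(T)
    e = np.zeros(T)
    alpha = np.zeros((T, N))      # alpha[n, p-1] = dy(n)/da_p, Eq. (3-47)
    beta = np.zeros((T, M + 1))   # beta[n, q]    = dy(n)/db_q, Eq. (3-48)

    def past(sig, n, lags):
        # sig(n - lag) for each lag; zero before the start of the record
        return np.array([sig[n - lag] if n - lag >= 0 else 0.0 for lag in lags])

    for n in range(T):
        X = past(x, n, range(0, M + 1))
        Y = past(y, n, range(1, N + 1))
        y[n] = B @ X - A @ Y                  # filter output, Eq. (3-34)
        e[n] = y[n] - d[n]                    # error, Eq. (3-35)
        alpha[n] = -Y                         # sensitivity filters, (3-47), (3-48)
        beta[n] = X
        for i in range(1, N + 1):
            if n - i >= 0:
                alpha[n] -= A[i - 1] * alpha[n - i]
                beta[n] -= A[i - 1] * beta[n - i]
        lo = max(0, n - L + 1)                # window of the last L errors
        grad_A = 2.0 * e[lo:n + 1] @ alpha[lo:n + 1]   # Eq. (3-43)
        grad_B = 2.0 * e[lo:n + 1] @ beta[lo:n + 1]    # Eq. (3-44)
        A = A + k_a * grad_A                  # coefficient update, Eq. (3-42)
        B = B + k_b * grad_B
    return A, B, y, e

# Hypothetical test: identify d(n) = x(n) + 0.5 d(n-1), i.e. b_0 = 1, a_1 = -0.5.
rng = np.random.default_rng(0)
x = rng.standard_normal(6000)
d = np.zeros_like(x)
for n in range(len(x)):
    d[n] = x[n] + (0.5 * d[n - 1] if n > 0 else 0.0)
A, B, y, e = adaptive_recursive_filter(x, d, M=0, N=1, L=8, k_a=-0.001, k_b=-0.001)
```

The sensitivities are stored for the whole record only to keep the sketch short; an implementation concerned with memory would keep just the last max(N, L) rows.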

Note that because the gradient of a finite-point squared error average is used to estimate the true gradient of the mean square error, this filter cannot give an optimal solution; but the more averaging points are used, the better the performance that is expected. It should be noted that if L = 1, that is,

$$\hat{\nabla}_A [E(e^2(n))] = \nabla_A [e^2(n)]$$

$$\hat{\nabla}_B [E(e^2(n))] = \nabla_B [e^2(n)],$$

this corresponds to using the instantaneous squared error for estimating the gradient, and this filter reduces to the adaptive recursive filter proposed by White [3]. If the further approximation is made that the sensitivity components of Equations (3-47), (3-48) are

$$\frac{\partial y(n)}{\partial a_p} \approx -y(n-p), \qquad \frac{\partial y(n)}{\partial b_q} \approx x(n-q),$$

then the estimated gradient is

$$\hat{\nabla}_A [E(e^2(n))] = -2 e(n)\, [\, y(n-1),\; y(n-2),\; \ldots,\; y(n-N) \,]^T = -2 e(n)\, Y \tag{3-50}$$

and

$$\hat{\nabla}_B [E(e^2(n))] = 2 e(n)\, [\, x(n),\; x(n-1),\; \ldots,\; x(n-M) \,]^T = 2 e(n)\, X \tag{3-51}$$

Then the filter coefficient updating process is

$$
\begin{aligned}
A(n+1) &= A(n) - 2 k_a e(n)\, Y \\
B(n+1) &= B(n) + 2 k_b e(n)\, X
\end{aligned}
\tag{3-52}
$$

where k_a, k_b < 0, and the filter output is

$$y(n) = B(n)^T X - A^T(n)\, Y \tag{3-53}$$

Equations (3-53), (3-52), (3-51), (3-50) are exactly the same as the algorithm proposed by Feintuck [2]. This Feintuck algorithm has an advantage in simplicity when compared with the algorithm proposed by White [3] and the one proposed here, both of which require additional recursive filters to generate the estimates of the gradient. Thus, it may be useful to extend the Feintuck algorithm to the two-dimensional recursive adaptive filter for simplicity. In the next section, the algorithms proposed by White and Feintuck are extended to the two-dimensional algorithm.
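The simplified update of Equations (3-50) to (3-53) needs no sensitivity filters, which the following Python sketch makes visible. It is a sketch under assumptions rather than the original code: the function name `feintuck_filter`, the zero initial coefficients and the test signal are hypothetical choices for illustration.

```python
import numpy as np

def feintuck_filter(x, d, M, N, k_a, k_b):
    """Simplified recursive LMS update of Eqs. (3-50) to (3-53); k_a, k_b < 0."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    T = len(x)
    A = np.zeros(N)          # a_1..a_N
    B = np.zeros(M + 1)      # b_0..b_M
    y = np.zeros(T)
    e = np.zeros(T)
    for n in range(T):
        X = np.array([x[n - i] if n - i >= 0 else 0.0 for i in range(0, M + 1)])
        Y = np.array([y[n - i] if n - i >= 0 else 0.0 for i in range(1, N + 1)])
        y[n] = B @ X - A @ Y             # Eq. (3-53)
        e[n] = y[n] - d[n]
        A = A - 2.0 * k_a * e[n] * Y     # Eq. (3-52)
        B = B + 2.0 * k_b * e[n] * X
    return A, B, y, e

# Hypothetical test: identify d(n) = x(n) + 0.5 d(n-1), i.e. b_0 = 1, a_1 = -0.5.
rng = np.random.default_rng(0)
x = rng.standard_normal(6000)
d = np.zeros_like(x)
for n in range(len(x)):
    d[n] = x[n] + (0.5 * d[n - 1] if n > 0 else 0.0)
A, B, y, e = feintuck_filter(x, d, M=0, N=1, k_a=-0.005, k_b=-0.005)
```

Compared with the windowed-gradient algorithm, each iteration here costs only two inner products and two vector updates, which is the simplicity advantage noted in the text.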

3. Two-dimensional Recursive Adaptive Filter

In this section, a mathematical model of the adaptive recursive filter for the processing of two-dimensional signals is proposed. This can be considered as an extension of Feintuck's algorithm to two-dimensional filters.

Two transversal filters having the same structure as the linear combinatorial system used in the non-recursive two-dimensional processor are used in the recursive processor, one for the feedforward path and one for the feedback path.

The two-dimensional recursive filter is described by its transfer function

$$H(z_1, z_2) = \frac{Y(z_1, z_2)}{X(z_1, z_2)} = \frac{\sum_{i=0}^{p} \sum_{j=0}^{q} b_{ij}\, z_1^{-i} z_2^{-j}}{1 + \sum_{m=0}^{M} \sum_{\substack{n=0 \\ (m,n) \neq (0,0)}}^{N} a_{mn}\, z_1^{-m} z_2^{-n}}$$

In the spatial domain, the input-output relation of the digital filter is given by

$$y(k,\ell) = \sum_{i=0}^{p} \sum_{j=0}^{q} b_{ij}\, x(k-i, \ell-j) - \sum_{m=0}^{M} \sum_{\substack{n=0 \\ (m,n) \neq (0,0)}}^{N} a_{mn}\, y(k-m, \ell-n) \tag{3-54}$$


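The following notation is introduced:

$$B^T = [\, b_{00}, b_{01}, \ldots, b_{0q}, b_{10}, b_{11}, \ldots, b_{1q}, b_{20}, \ldots, b_{pq} \,]$$

$$X^T = [\, x(k,\ell), x(k,\ell-1), \ldots, x(k,\ell-q), x(k-1,\ell), x(k-1,\ell-1), \ldots, x(k-1,\ell-q), x(k-2,\ell), \ldots, x(k-p,\ell-q) \,]$$

and

$$A^T = [\, a_{01}, a_{02}, \ldots, a_{0N}, a_{10}, a_{11}, \ldots, a_{1N}, a_{20}, \ldots, a_{MN} \,]$$

$$Y^T = [\, y(k,\ell-1), y(k,\ell-2), \ldots, y(k,\ell-N), y(k-1,\ell), y(k-1,\ell-1), \ldots, y(k-1,\ell-N), y(k-2,\ell), \ldots, y(k-M,\ell-N) \,]$$

The filter coefficient vectors A and B are [(M+1)(N+1)-1] x 1 and (p+1)(q+1) x 1, respectively, and the input and output signal vectors X and Y are again (p+1)(q+1) x 1 and [(M+1)(N+1)-1] x 1, respectively.

Then Equation (3-54) can be written as

$$y(k,\ell) = B^T X - A^T Y \tag{3-55}$$

Here, to obtain an estimate of the gradients of the mean square error function, a single sample of the squared error is taken. That is,

$$\hat{\nabla} [E(e^2(k,\ell))] = \nabla [e^2(k,\ell)] = 2 e(k,\ell)\, \nabla [e(k,\ell)],$$

and again the adaptation scheme (filter coefficient updating process) is used in the same fashion as in the nonrecursive case [see Figure 3-5].

Denoting the error e(k,ℓ) at the jth iteration as e(j), then

$$
\begin{aligned}
A(j+1) &= A(j) + 2 k_a e(j)\, \nabla_A [e(j)] \\
B(j+1) &= B(j) + 2 k_b e(j)\, \nabla_B [e(j)]
\end{aligned}
\tag{3-56}
$$

where

$$e(k,\ell) = e(j) = y(k,\ell) - d(k,\ell) \tag{3-57}$$

The components of the vectors $\nabla_A [e(k,\ell)]$ and $\nabla_B [e(k,\ell)]$ can be calculated as follows.

From Equation (3-57),

$$\nabla_A [e(k,\ell)] = \nabla_A [y(k,\ell)] = \Big[ \frac{\partial y(k,\ell)}{\partial a_{01}}, \ldots, \frac{\partial y(k,\ell)}{\partial a_{0N}}, \frac{\partial y(k,\ell)}{\partial a_{10}}, \ldots, \frac{\partial y(k,\ell)}{\partial a_{1N}}, \frac{\partial y(k,\ell)}{\partial a_{20}}, \ldots, \frac{\partial y(k,\ell)}{\partial a_{MN}} \Big]^T \tag{3-58}$$

and

$$\nabla_B [e(k,\ell)] = \nabla_B [y(k,\ell)] = \Big[ \frac{\partial y(k,\ell)}{\partial b_{00}}, \ldots, \frac{\partial y(k,\ell)}{\partial b_{0q}}, \frac{\partial y(k,\ell)}{\partial b_{10}}, \ldots, \frac{\partial y(k,\ell)}{\partial b_{1q}}, \frac{\partial y(k,\ell)}{\partial b_{20}}, \ldots, \frac{\partial y(k,\ell)}{\partial b_{pq}} \Big]^T \tag{3-59}$$

Note that $\nabla_A [e(k,\ell)]$ and $\nabla_B [e(k,\ell)]$ have the same dimensions as A and B, that is, [(M+1)(N+1)-1] x 1 and [(p+1)(q+1)] x 1, respectively.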
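From the recursive relation of Equation (3-54),

$$\frac{\partial y(k,\ell)}{\partial a_{rs}} = -y(k-r, \ell-s) - \sum_{m=0}^{M} \sum_{\substack{n=0 \\ (m,n) \neq (0,0)}}^{N} a_{mn} \frac{\partial y(k-m, \ell-n)}{\partial a_{rs}} \tag{3-60}$$

and

$$\frac{\partial y(k,\ell)}{\partial b_{uv}} = x(k-u, \ell-v) - \sum_{m=0}^{M} \sum_{\substack{n=0 \\ (m,n) \neq (0,0)}}^{N} a_{mn} \frac{\partial y(k-m, \ell-n)}{\partial b_{uv}} \tag{3-61}$$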
The recursive relationship of Equations (3-60) and (3-61) should be noted again; it can be implemented by additional recursive filters.

Forming the instantaneous error gradient of Equations (3-54), (3-55) using the output of the additional recursive filters of Equations (3-60), (3-61), the filter coefficient adaptation process of Equations (3-56), (3-57) can be performed. Note that this algorithm corresponds to the two-dimensional version of the algorithm proposed by White [3].

If we make the approximation

$$\frac{\partial y(k,\ell)}{\partial a_{rs}} \approx -y(k-r, \ell-s), \qquad \frac{\partial y(k,\ell)}{\partial b_{uv}} \approx x(k-u, \ell-v),$$

then it follows that

$$\nabla_A [e(k,\ell)] = -[\, y(k,\ell-1), y(k,\ell-2), \ldots, y(k-1,\ell), y(k-1,\ell-1), \ldots, y(k-2,\ell), \ldots, y(k-M,\ell-N) \,]^T \triangleq -Y$$

and

$$\nabla_B [e(k,\ell)] = [\, x(k,\ell), x(k,\ell-1), \ldots, x(k,\ell-q), x(k-1,\ell), \ldots, x(k-p,\ell-q) \,]^T \triangleq X$$

Therefore, in this case, the complete algorithm is described by

$$e(k,\ell) = e(j) = y(k,\ell) - d(k,\ell)$$

$$A(j+1) = A(j) - 2 k_a e(j)\, Y$$

$$B(j+1) = B(j) + 2 k_b e(j)\, X$$

and

$$y(k,\ell) = B^T X - A^T Y$$

This is the two-dimensional version of the algorithm proposed by Feintuck [2].
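A minimal Python sketch of this two-dimensional update follows. It is illustrative only and rests on assumptions not stated in the original: the image is scanned in raster order (row by row), samples outside the image are taken as zero, the coefficients are kept as (M+1) x (N+1) and (p+1) x (q+1) arrays rather than stacked vectors, and the names `feintuck_2d` and `window` are hypothetical.

```python
import numpy as np

def feintuck_2d(x, d, p, q, M, N, k_a, k_b):
    """Two-dimensional simplified recursive LMS update; k_a, k_b < 0.

    B[i, j] plays the role of b_ij and A[m, n] of a_mn.
    A[0, 0] is excluded from the model and stays zero.
    """
    K, Lc = x.shape
    B = np.zeros((p + 1, q + 1))
    A = np.zeros((M + 1, N + 1))
    y = np.zeros((K, Lc))
    e = np.zeros((K, Lc))

    def window(img, k, l, rows, cols):
        # img(k-i, l-j) for i < rows, j < cols; zero outside the image
        w = np.zeros((rows, cols))
        for i in range(rows):
            for j in range(cols):
                if k - i >= 0 and l - j >= 0:
                    w[i, j] = img[k - i, l - j]
        return w

    for k in range(K):
        for l in range(Lc):
            X = window(x, k, l, p + 1, q + 1)
            Y = window(y, k, l, M + 1, N + 1)
            Y[0, 0] = 0.0                       # (m, n) = (0, 0) is excluded
            y[k, l] = np.sum(B * X) - np.sum(A * Y)
            e[k, l] = y[k, l] - d[k, l]
            A = A - 2.0 * k_a * e[k, l] * Y     # feedback coefficient update
            B = B + 2.0 * k_b * e[k, l] * X     # feedforward coefficient update
    return A, B, y, e

# Hypothetical test field: d(k,l) = x(k,l) + 0.5 x(k,l-1) with white x.
rng = np.random.default_rng(0)
x2 = rng.standard_normal((48, 48))
d2 = x2.copy()
d2[:, 1:] += 0.5 * x2[:, :-1]
A2, B2, y2, e2 = feintuck_2d(x2, d2, p=1, q=1, M=1, N=1, k_a=-0.01, k_b=-0.01)
```

Because the entry Y[0, 0] is forced to zero, the update never changes A[0, 0], which matches the exclusion of (m, n) = (0, 0) in Equation (3-54).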
IV. ADAPTIVE NOISE CANCELLER


A. THE CONCEPT OF ADAPTIVE NOISE CANCELLING

Noise cancelling is a variation of optimal filtering that is highly advantageous in many applications. In particular, Wiener filtering and Kalman filtering, which are optimal, require a priori knowledge of both signal and noise statistics. Adaptive filters, on the other hand, have the ability to adjust their own parameters automatically, and their design requires little or no a priori knowledge of signal and noise statistics, whereas the Wiener approach utilizes a fixed-parameter filter based on known statistics.

Figure (1-1) shows the basic problem and the adaptive noise cancelling solution to it. It makes use of a reference input derived from one (or more) sensors located at points in the noise field where the signal is weak or undetectable.

This input is filtered and subtracted from a primary input containing both signal and noise. As a result, the primary noise is attenuated or eliminated by cancellation.

At  first  glance,  subtracting  noise  from  a signal  seems  to 
be  a dangerous  procedure.  If  done  improperly  it  could  result 
in  an  increase  in  output  noise  power.  If,  however,  filtering 
and  subtraction  are  controlled  by  an  appropriate  adaptive 
process,  noise  reduction  can  be  accomplished  with  little  risk 
of  distorting  the  signal  or  increasing  the  output  noise  level. 
FIGURE 4-1. THE ADAPTIVE NOISE CANCELLING CONCEPT


FIGURE 4-2. NOISE CANCELLING WITHOUT AN EXTERNAL
REFERENCE SOURCE
The following argument is mainly due to Widrow et al. [4]. In Figure (4-1), a signal S is transmitted over a channel to a sensor that also receives a noise $N_0$. A second sensor receives a noise $N_1$, uncorrelated with the signal but correlated in some unknown way with the noise $N_0$. In addition to these noises, additive random noises $M_0$ and $M_1$, uncorrelated with each other and with S, $N_0$ and $N_1$, are present. Then the reference input is

$$d = S + N_0 + M_0 \tag{4-1}$$

and the primary input is

$$x = N_1 + M_1 \tag{4-2}$$

The noise $N_1 + M_1$ is filtered to produce an output y that is as close a replica as possible of $N_0 + M_0$. This output is subtracted from the reference input $S + N_0 + M_0$ to produce the system output

$$z = S + N_0 + M_0 - y$$

In other words, the practical objective of the noise cancelling system is to produce a system output $z = S + N_0 + M_0 - y$ that is a best fit in the least squares sense to the signal S. This objective is accomplished by feeding the system output back to the adaptive filter and adjusting the filter through the LMS adaptive algorithm (described in the previous chapter) to minimize the total system output power. Note that the system output serves as the error signal for the adaptive process.
Assume for the moment that the noises $M_0$ and $M_1$ do not exist; then, if one knew the characteristics of the channels over which the noise is transmitted to the reference input, it would theoretically be possible to design a fixed filter capable of changing $N_1$ into $N_0$. That is, if the correct model of this transmission channel, H(z), is obtained, the adaptive filter would be simply 1/H(z), a fixed filter.

Assume that S, $N_0$, $N_1$, $M_0$, $M_1$ and y are statistically stationary and have zero means. Assume that S is uncorrelated with $N_0$ and $N_1$, that $M_0$ and $M_1$ are uncorrelated with each other and with S, $N_0$ and $N_1$, and suppose that $N_1$ is correlated with $N_0$.

The output z is

$$z = S + N_0 + M_0 - y \tag{4-3}$$

Squaring, one obtains

$$z^2 = S^2 + (N_0 + M_0 - y)^2 + 2 S (N_0 + M_0 - y) \tag{4-4}$$

Taking expectations of both sides, and realizing that S is uncorrelated with $N_0$, $M_0$ and y, yields

$$
\begin{aligned}
E[z^2] &= E[S^2] + E[(N_0 + M_0 - y)^2] + 2 E[S (N_0 + M_0 - y)] \\
&= E[S^2] + E[(N_0 + M_0 - y)^2]
\end{aligned}
\tag{4-5}
$$

The signal power $E[S^2]$ will be unaffected as the filter is adjusted to minimize $E[z^2]$. Accordingly, the minimum output power is

$$\min E[z^2] = E[S^2] + \min E[(N_0 + M_0 - y)^2] \tag{4-6}$$

Since the filter is adjusted so that $E[z^2]$ is minimized, $E[(N_0 + M_0 - y)^2]$ is therefore minimized. The filter output y is then a best least squares estimate of the noise $N_0 + M_0$. Moreover, when $E[(N_0 + M_0 - y)^2]$ is minimized, $E[(z - S)^2]$ is also minimized, since, from (4-3),

$$z - S = M_0 + N_0 - y \tag{4-7}$$

Adjusting or adapting the filter to minimize the total output power is thus equivalent to causing the output z to be a best least squares estimate of the signal S for a given structure and adjustability of the adaptive filter and for the given reference input. The output z will contain the signal S plus noise.

From (4-3), the output noise is given by $(N_0 + M_0 - y)$. Since minimizing $E[z^2]$ minimizes $E[(N_0 + M_0 - y)^2]$, minimizing the total output power minimizes the output noise power. Since the signal in the output remains constant, minimizing the total output power maximizes the output signal-to-noise ratio. Note that if $E[(N_0 + M_0 - y)^2] = 0$ can be achieved, then $E[z^2] = E[S^2]$; therefore $y = N_0 + M_0$ and $z = S$. In this case, minimizing output power causes the output signal to be perfectly noise free. Also note that, on the other hand, when the reference input is completely uncorrelated with the primary input, the filter will "turn itself off" and will not increase output noise.

In this case, the filter output y will be uncorrelated with the primary input. The output power will be

$$
\begin{aligned}
E[z^2] &= E[(S + M_0 + N_0)^2] - 2 E[y (S + N_0 + M_0)] + E[y^2] \\
&= E[(S + N_0 + M_0)^2] + E[y^2]
\end{aligned}
\tag{4-8}
$$

Therefore, minimizing output power requires that $E[y^2]$ be minimized, which is accomplished by making all weights zero, bringing $E[y^2]$ to zero.
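The power decomposition of Equation (4-5) can be checked numerically. The sketch below is only an illustration under assumed signal models that do not appear in the original: S, N_1 and M_0 are independent Gaussian sequences, N_0 is N_1 passed through a short hypothetical channel, and y is produced by an arbitrary fixed (non-optimal) two-tap filter acting on N_1.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

S = rng.standard_normal(n)                  # signal (assumed model)
N1 = rng.standard_normal(n)                 # noise seen by the second sensor
M0 = 0.1 * rng.standard_normal(n)           # additive noise at the first sensor
N0 = np.convolve(N1, [0.8, 0.3])[:n]        # N0 correlated with N1 (hypothetical channel)

# An arbitrary fixed filter acting on N1; y is uncorrelated with S.
y = np.convolve(N1, [0.5, 0.1])[:n]
z = S + N0 + M0 - y                         # system output, Eq. (4-3)

lhs = np.mean(z ** 2)                               # E[z^2]
rhs = np.mean(S ** 2) + np.mean((N0 + M0 - y) ** 2) # right-hand side of Eq. (4-5)

# If the filter reproduced N0 exactly, only S and M0 would remain in the output.
z_matched = S + N0 + M0 - N0
power_matched = np.mean(z_matched ** 2)
```

The sample values of `lhs` and `rhs` differ only by the sample estimate of the cross term 2E[S(N_0 + M_0 - y)], which shrinks as the record length grows.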

It should be noted that in applying adaptive techniques to a practical systems problem, the key step lies in providing an appropriate desired response signal for the adaptation process; that is, the reference input should be provided through an appropriate scheme, while exact knowledge of the statistical characteristics is not required. In adaptive modeling applications, the desired response is generally available as the output of the unknown system to be modeled. Likewise, in the noise cancelling scheme above, the reference input is available by sensing noise which is correlated in some manner with the noise at the primary input.

In the next section, the signal filtering problem is discussed for the case in which no external reference input free of signal is available.

B. NOISE CANCELLING WITHOUT AN EXTERNAL REFERENCE INPUT

This section is concerned with signal filtering (estimation) of a noise-corrupted signal when no external reference input is available. Here, it is assumed that only the noise-corrupted signal is available; that is, referring to Figure (4-1) of the previous section, a signal-free noise that is correlated with the noise corrupting the signal S is not available; only S + N is available.

It is proposed to estimate the signal S by cancelling the noise in some adaptive way. In the following, it is shown how a reference input can be obtained for the adaptation process under certain conditions. Assume that the noise-corrupted signal x = S + N is composed of broadband noise N and a narrowband signal S; then the autocorrelation function of the signal is broad and that of the noise is narrow. Also assume that the noise N is uncorrelated with the signal and that the mean values of both signal and noise are zero.

Consider a signal delayed by $\delta$ units,

$$x(j-\delta) = S(j-\delta) + N(j-\delta) \tag{4-9}$$

where $\delta$ is a sufficient number of time units so that the noise component is decorrelated, but the signal component still remains correlated.

Then

$$
\begin{aligned}
E[N(j)\, N(j-\delta)] &= 0 \\
E[S(j)\, S(j-\delta)] &\neq 0, \text{ finite}
\end{aligned}
\tag{4-10}
$$

For the two-dimensional signal, consider a signal delayed by $\delta_1$, $\delta_2$ units in the horizontal and vertical directions respectively, where $\delta_1$ and $\delta_2$ are sufficient numbers of spatial units such that the noise field is decorrelated but the signal field still remains correlated; then

$$
\begin{aligned}
E[N(k,\ell)\, N(k-\delta_1, \ell-\delta_2)] &= 0 \\
E[S(k,\ell)\, S(k-\delta_1, \ell-\delta_2)] &\neq 0, \text{ finite}
\end{aligned}
\tag{4-11}
$$

and again it is assumed that the signal field and noise field are not correlated with each other.

Now if this delayed signal is used as a primary input and the original input is used as a reference input to the adaptive filter, then, referring to Figure (4-1) of the previous section, $S(j-\delta)$ or $S(k-\delta_1, \ell-\delta_2)$ can be considered as $N_1$ in Figure (4-1) and $N(j-\delta)$ or $N(k-\delta_1, \ell-\delta_2)$ as $M_1$, and $S(j)$ or $S(k,\ell)$ can be considered as $N_0$ and $N(j)$ or $N(k,\ell)$ as $M_0$ in Figure (4-2), respectively.

From Equation (4-10) and the assumption that the signal and noise are uncorrelated, it is seen that the assumptions made in the last section for the various signals hold here.

Therefore, from the argument in Section IV-A, the filter output would be a good estimate of the signal S. Figure 4-2 shows the noise cancelling (or signal estimation) scheme discussed above.
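The delay-based scheme can be sketched with a transversal LMS filter driven by the delayed input, with the undelayed input serving as the desired response. The code below is a hedged illustration, not the original program: the function name `delayed_lms_enhancer`, the sinusoidal stand-in for the narrowband signal, the tap count and the step constant are assumptions chosen for the example. The sign convention follows the text (error = output minus desired, negative step constant).

```python
import numpy as np

def delayed_lms_enhancer(x, taps, delay, k):
    """Estimate the narrowband part of x from its own delayed samples (k < 0)."""
    x = np.asarray(x, dtype=float)
    W = np.zeros(taps)
    y = np.zeros(len(x))
    for j in range(len(x)):
        # Filter input vector: x(j-delay), x(j-delay-1), ..., zero before the record
        U = np.array([x[j - delay - i] if j - delay - i >= 0 else 0.0
                      for i in range(taps)])
        y[j] = W @ U                  # filter output: estimate of S(j)
        err = y[j] - x[j]             # error = output minus desired response
        W = W + 2.0 * k * err * U     # LMS update with negative step constant
    return y, W

# Hypothetical narrowband signal in white noise of variance 0.5.
rng = np.random.default_rng(2)
j = np.arange(6000)
S = np.sin(0.025 * j)
x = S + np.sqrt(0.5) * rng.standard_normal(len(j))

y, W = delayed_lms_enhancer(x, taps=10, delay=1, k=-0.005)

half = len(j) // 2                                # skip the adaptation transient
mse_in = np.mean((x[half:] - S[half:]) ** 2)      # noise power before filtering
mse_out = np.mean((y[half:] - S[half:]) ** 2)     # residual error after filtering
```

A one-sample delay suffices here because the noise is white; for correlated noise the delay would have to exceed the noise correlation length, as required by Equation (4-10).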
EXPERIMENT AND RESULTS


In this chapter, a computer experiment is performed to check the feasibility of the algorithms derived in Chapter III for certain applications. The signal estimation problem for a noise-corrupted signal is treated here for both one-dimensional and two-dimensional cases. Nonrecursive adaptive filtering and recursive adaptive filtering have been examined, and the performance of the adaptive filters is compared to that of the optimal Wiener solution. The adaptive noise cancelling scheme is used for this application.

First, consider a band-limited one-dimensional signal S corrupted by noise N; it is desired to estimate the signal.

If the statistics of both signal and noise are known a priori, a fixed optimal filter to estimate the signal can be designed by the Wiener-Hopf solution of Equation (3-11).

Here it is assumed that these statistics are not known a priori, but only that the signal is narrowband, the noise is broadband, and the signal is entirely uncorrelated with the noise. Then the signal has a wide correlation function while the noise has a narrow correlation function. Separation of this broadband noise and narrowband signal is now required for the estimation of the signal.

It is assumed further that the desired (or reference) signal which is needed for the adaptive process is not available; that is, no other reference signal is available which may have some correlation with the signal we want to estimate.


This problem can be considered as an adaptive noise cancelling problem without a reference input. Assuming that the noise is white, then from Figure (4-2) a one-unit delay is enough to decorrelate the noise component appearing in the adaptive filter input from the noise component in the desired signal. These components will thus appear in the error but not in the filter output. The narrowband component, on the other hand, will not be decorrelated by the delay and will appear in the adaptive filter output.

The input signal would be

$$x(j) = S(j) + N(j)$$

where S(j) = band-limited signal
N(j) = white noise

and the reference input would be

$$d(j) = S(j-1) + N(j-1).$$

The form of the autocorrelation function of the signal is assumed to be

$$R(m) = \rho^{|m|} \cos \omega_0 m$$

For the purpose of computer simulation, the following values are assigned:

$\rho$ = 0.95
$\omega_0$ = 0.025

and the variance of the noise is 0.5.
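With these values, a fixed Wiener filter can be obtained by solving the normal equations numerically. The sketch below is an illustration under assumptions and is not the computation reported in this work: it assumes that the delayed sample x(j-1) drives a 10-tap transversal filter and that the undelayed sample x(j) is the desired response (the arrangement of Section IV-B), and the names `R_s`, `R_uu` and `p_vec` are hypothetical.

```python
import numpy as np

rho, w0, noise_var, taps = 0.95, 0.025, 0.5, 10

def R_s(m):
    """Assumed signal autocorrelation R(m) = rho^|m| cos(w0 m)."""
    return rho ** abs(m) * np.cos(w0 * m)

# Autocorrelation matrix of the filter input u(j) = x(j-1): signal plus white noise.
R_uu = np.array([[R_s(i - c) + (noise_var if i == c else 0.0)
                  for c in range(taps)] for i in range(taps)])

# Cross correlation between the desired response x(j) and u(j-i) = x(j-1-i).
# The white noise contributes nothing because the lag is at least one sample.
p_vec = np.array([R_s(i + 1) for i in range(taps)])

# Wiener-Hopf solution W = R^{-1} p, solved without forming the inverse.
W = np.linalg.solve(R_uu, p_vec)

# Mean square error between the filter output and the signal S(j),
# which equals R(0) - W^T p when W satisfies the normal equations.
mse_signal = R_s(0) - W @ p_vec
```

Solving the linear system with `np.linalg.solve` is preferred here over explicit inversion because it is numerically better conditioned for a correlation matrix whose rows are nearly dependent.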
For the optimum filter design using the above values, a transversal filter having 10 delays is used. From Equation (3-11),

$$W_{LMS} = R_{xx}^{-1} R_{xd}$$

The autocorrelation matrix $R_{xx}$ was computed as
and the cross-correlation matrix $R_{xd}$ as

^cd''  = 

^ l.OOOuO  0.90137  "C.6'^9^‘*  0.81044  0.r6774  0.72bB4  0.48747  0.68020  0.6l43( 

Then the optimum Wiener-Hopf solution gives the filter coefficients as


W = Wiener weight vector =


/ \ 

0.33267 

-0.21191 

0.134  75 

0.08556 
0.05*27  — 

—u..  03434 

3.02169 

0*01369 

~0. 00370 
For simulation of the band-limited signal, the state and output equations of example 1) in Section 2 of Chapter II are used.

For the nonrecursive adaptive filter application, again 10 delays and a step size of -0.005 in the LMS algorithm were used. In the recursive filter application, for both Feintuck's algorithm and the algorithm developed here, 2 delays were used in each of the feedforward and feedback paths, with -0.001 for k_a and k_b (Equation (3-42)). Eight points were used for squared error averaging in the gradient estimation (L = 8 in Equation (3-40)). The experimental results are plotted in the following figures, along with their descriptions and the optimal solution for the purpose of comparison. The results indicate that the adaptive recursive filter appears to perform as well as the optimal Wiener filter once it reaches a steady-state condition.
FIGURE R-1 SIGNAL STATISTICS
Autocorrelation = 0.96^|m| cos(0.025 m)

FIGURE R-2 NOISE CORRUPTED SIGNAL
Noise: white; zero mean; variance = 0.5

FIGURE R-3 WIENER-HOPF FILTERING
10 delays are used.

FIGURE R-4 WIDROW'S NONRECURSIVE ADAPTIVE FILTERING
Number of delays used: 10
Stepsize used: -0.005

FIGURE R-5 FILTERING BY FEINTUCK'S ALGORITHM
Number of delays in feedforward path: 2
Number of delays in feedback path: 2
Stepsize used: -0.001

FIGURE R-6 FILTERING BY THE ALGORITHM USING A FINITE POINT MOVING SQUARE ERROR AVERAGE FOR THE ESTIMATION OF GRADIENT
Number of delays in feedforward path: 2
Number of delays in feedback path: 2
Stepsize used: -0.001
As a second application, consider an image sensed by an image sensing device with an N x N array of sensing elements. It is assumed that this image is composed of a correlated background and a target trajectory of three diagonal lines. This image may be corrupted by the internal noise of the device (assumed white). Then the output image includes three types of processes:

$$x(k,\ell) = S(k,\ell) + T(k,\ell) + W(k,\ell)$$

where

S(k,ℓ) = correlated background

T(k,ℓ) = target strength (three diagonal lines)

W(k,ℓ) = noise.

Again it is assumed that no statistics are known a priori and that the correlated background is a narrowband signal. It is further assumed that the correlated background and noise are uncorrelated with each other. It is proposed to separate the three diagonal lines from the background noise. Again, the same argument holds that this problem is a two-dimensional noise-cancelling problem in which no reference is available. It is further assumed that the correlated background is a band-pass process for which the autocorrelation function is

$$R(m,n) = \rho_h^{|m|}\, \rho_v^{|n|} \cos \omega_h m \, \cos \omega_v n$$

where $\rho_h$, $\rho_v$ represent the horizontal and vertical direction correlation coefficients respectively.

Going back to Figure (4-2) in the previous section, the delay
shown there will be sufficient to decorrelate the white noise
and the diagonal line target. As a result of filtering, the
system output (or the residual field) will then be the desired
signal. It should be noted that this residual signal is composed
of an estimate of the three diagonal lines and white noise, as
well as some granularity due to the fixed stepsize [1].
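
The role of the delay can be illustrated with a one-dimensional
analogue. The sketch below is an assumption-laden stand-in, not the
two-dimensional recursive algorithm of this study: it uses a plain
nonrecursive LMS predictor, a single sinusoid at 0.143 rad/sample in
place of the narrowband background, a one-sample delay, and a positive
step size with the update written as w <- w + 2*mu*e*x (the sign
convention behind the negative step sizes quoted in the figures is not
reproduced). The function name `lms_decorrelate` is hypothetical.

```python
import math
import random


def lms_decorrelate(x, n_taps=10, delay=1, mu=0.005):
    """Predict x[k] from samples at least `delay` old; return the residual."""
    w = [0.0] * n_taps
    residual = []
    for k in range(len(x)):
        # Delayed tap vector x[k-delay], x[k-delay-1], ... (zero before start).
        taps = [x[k - delay - i] if k - delay - i >= 0 else 0.0
                for i in range(n_taps)]
        y = sum(wi * ti for wi, ti in zip(w, taps))  # estimate of correlated part
        e = x[k] - y                                 # residual (system output)
        # LMS update with mu > 0 (assumed sign convention).
        w = [wi + 2.0 * mu * e * ti for wi, ti in zip(w, taps)]
        residual.append(e)
    return residual


def power(seq):
    """Average power of a sequence."""
    return sum(v * v for v in seq) / len(seq)


rng = random.Random(0)
n = 4000
# Unit-variance sinusoid standing in for the narrowband background.
narrowband = [math.sqrt(2.0) * math.cos(0.143 * k + 1.0) for k in range(n)]
# White noise of variance 0.1, as in the second application.
white = [rng.gauss(0.0, math.sqrt(0.1)) for _ in range(n)]
x = [s + v for s, v in zip(narrowband, white)]

residual = lms_decorrelate(x)
input_power = power(x[n // 2:])
residual_power = power(residual[n // 2:])
print("input power:", round(input_power, 3))
print("residual power after adaptation:", round(residual_power, 3))
```

Because the white component is uncorrelated across the delay, the
predictor can only track the correlated component, so the residual is
dominated by the white part of the input once the weights settle.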

The  problem  of  enhancing  the  target  diagonal  line  which 
is  subjected  to  the  noise  (white  noise  and  adaptation  noise) 
is  another  problem  of  interest.  It  will  not  be  considered 
in  this  work. 

For the purpose of simulation, the following values were used:

1)  $\rho_v = \rho_h = 0.96$,  $\omega_h = \omega_v = 0.143$

2)  $\rho_v = \rho_h = 0.99$,  $\omega_h = \omega_v = 0.143$

Variance of correlated background = 1.0
White noise variance = 0.1
Target diagonal line intensity = 1.8
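
A test field with these statistics could be synthesized as in the
following sketch, which is offered only as one possible construction
and is not taken from this study. It assumes the separable background
autocorrelation 0.96^|m| 0.96^|n| cos(0.143 m) cos(0.143 n) of the
first parameter set, an image size N = 32, and streaks along the main
diagonal and the diagonals offset by 8 pixels on either side; the image
size and streak positions are not stated in the text. The background
is generated as L Z L^T, where Z is unit-variance white noise and L is
the Cholesky factor of the one-dimensional Toeplitz covariance, which
yields the separable covariance exactly. All names are hypothetical.

```python
import math
import random


def autocorr(m, rho, omega):
    """One-dimensional factor of the background autocorrelation."""
    return rho ** abs(m) * math.cos(omega * m)


def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def make_image(n=32, rho=0.96, omega=0.143, noise_var=0.1,
               intensity=1.8, offsets=(-8, 0, 8), seed=0):
    """Correlated background + three diagonal streaks + white noise."""
    rng = random.Random(seed)
    cov = [[autocorr(i - j, rho, omega) for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    low_t = [list(col) for col in zip(*low)]
    z = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    # L Z L^T has covariance cov[i][i'] * cov[j][j'] (unit variance).
    background = matmul(matmul(low, z), low_t)
    image = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            target = intensity if (l - k) in offsets else 0.0
            noise = rng.gauss(0.0, math.sqrt(noise_var))
            image[k][l] = background[k][l] + target + noise
    return image, low, cov


image, low, cov = make_image()
print("image size:", len(image), "x", len(image[0]))
```

Calling `make_image(rho=0.99)` would give a field for the second
parameter set under the same assumptions.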

For the optimal filter design, using the above values with
$\rho_v = \rho_h = 0.96$, the Wiener-Hopf solution of Equation (3-30) is:

$$W_{LMS} = R_{xx}^{-1} R_{xd}$$

where $W_{LMS}$, $R_{xx}$, and $R_{xd}$ are defined by equations
(3-22), (3-28), and (3-29), respectively.

Using  p=3,  q = 3 in  equation  (3-20), 

Only the recursive adaptive filter is simulated and compared with
the nonrecursive Wiener filter. The simulation results, shown on
the following pages, indicate that although the optimal
performance of the Wiener filter is not achieved, the adaptive
recursive filter performs well.

FIGURE R-7  CORRELATED BACKGROUND + THREE STREAKS + WHITE NOISE
Background: $R(m,n) = 0.96^{|m|}\,0.96^{|n|}\,\cos(0.143\,m)\,\cos(0.143\,n)$, variance = 1.0
White noise: zero mean, variance = 0.1
Streaks intensity: 1.8

FIGURE R-8  RESIDUAL SIGNAL AFTER WIENER FILTERING OF FIGURE R-7
(WHITE NOISE + STREAKS)

FIGURE R-10  BACKGROUND + WHITE NOISE + STREAKS
Background: $R(m,n) = 0.99^{|m|}\,0.99^{|n|}\,\cos(0.143\,m)\,\cos(0.143\,n)$, variance = 1.0
White noise: zero mean, variance = 0.1
Streaks intensity: 1.8
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VI . CONCLUSIONS 


The study described herein has developed a new algorithm for the
one-dimensional signal filtering problem and extended it to
two-dimensional processing. It is an adaptive recursive filtering
algorithm based on the steepest descent gradient method, which
employs the finite-point square error for the gradient estimation
rather than the instantaneous square error.

A simplified two-dimensional version of this algorithm is
developed. It is designed to estimate the signal in real-time
operation in cases where the statistics of both the signal and
the corrupting noise are not available a priori. Although it is
not optimal, the algorithm learns the statistics and adapts; that
is, it seeks the minimum of the error criterion.

It should be noted that Widrow's nonrecursive adaptive filtering
algorithm reaches the global minimum of the performance criterion
because, for a stationary process, the mean square error is a
quadratic function of the weight vector. For the recursive
adaptive filter, however, a local minimum may be found instead of
the global minimum. The computer simulations show that, for the
examples considered here, the algorithms presented learn the
statistics of the signal and adapt. Several points can be observed
from the experimental results [see Figures R-1 through R-12]:

1)  All the algorithms presented here give a satisfactory
result after the transients die out, even though they are
not optimal.

2)  The algorithm that employs the finite-point average
square error for the gradient estimate converges more
rapidly than Feintuck's algorithm. A possible reason is
that the output information is fed back and used in the
filter coefficient updating process, so that the
sensitivity information propagates through the recursive
equation as the iteration proceeds, whereas Feintuck's
algorithm discards the sensitivity information.

3)  The algorithm developed here gives the best results among
the various algorithms presented, at the expense of more
complex hardware. Note that the required number of
additional sensitivity filters (equations (3-47), (3-48))
equals the number of filter coefficients, and the L-point
averaging process requires additional storage elements as
well. A possible reason for the good results is that the
averaging process [equation (3-40)] yields a gradient
estimate with a smaller error relative to the true
gradient than the estimate based on the instantaneous
square error, while both give unbiased gradient estimates.
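
The effect described in point 3 can be illustrated numerically. The
sketch below is a deliberately simplified stand-in and not the
algorithm of this study: it uses a single-weight nonrecursive filter
held at a fixed weight, white Gaussian input, and hypothetical
parameter values, and it omits the sensitivity filters of equations
(3-47) and (3-48). Under those assumptions it compares the gradient
estimate formed from the instantaneous square error with the sliding
L-point average of the same quantity.

```python
import random


def gradient_estimates(n=20000, window=10, w=0.2, w_true=1.0, seed=0):
    """Return instantaneous and L-point averaged gradient estimates."""
    rng = random.Random(seed)
    inst = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                 # unit-variance white input
        d = w_true * x + rng.gauss(0.0, 0.5)    # desired response plus noise
        e = d - w * x                           # error at the fixed weight w
        inst.append(-2.0 * e * x)               # d(e^2)/dw, instantaneous
    # Sliding average over the most recent `window` instantaneous estimates.
    avg = [sum(inst[k - window + 1:k + 1]) / window
           for k in range(window - 1, n)]
    return inst, avg


def mean_and_variance(seq):
    """Sample mean and (population) variance of a sequence."""
    m = sum(seq) / len(seq)
    v = sum((s - m) ** 2 for s in seq) / len(seq)
    return m, v


inst, avg = gradient_estimates()
mean_inst, var_inst = mean_and_variance(inst)
mean_avg, var_avg = mean_and_variance(avg)
# True gradient for this setup: -2 * (w_true - w) * E[x^2] = -1.6.
print("instantaneous: mean %.3f, variance %.3f" % (mean_inst, var_inst))
print("10-point avg : mean %.3f, variance %.3f" % (mean_avg, var_avg))
```

In this toy setting both estimates scatter around the same true
gradient, while the averaged estimate has a much smaller variance,
at the cost of storing the last L terms.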

Due  to  the  emerging  interest  in  adaptive  recursive 
filters,  further  research  on  this  subject  may  be  worthwhile. 
The following are left open for further research:

1)  Comparison of the steady-state performance of the recursive
adaptive filter with the Kalman filtering technique.
Expressing the filter coefficients in terms of the
steady-state Kalman filter gains would lead to a better
understanding of the performance of the recursive filter.

2)  Mathematical derivation of the bound on the step size of
the filter coefficient updating process for convergence
and stability. It is believed that this bound may be
obtained by first setting up constraints such that the
value of the performance criterion decreases
monotonically to a minimum as the iteration progresses.

3)  Modification of the algorithms for the case in which
partial statistics of the signal or noise are available
a priori.

4)  Derivation of the algorithm based on a different
performance criterion, such as maximum likelihood ratio,
maximum signal-to-noise ratio, etc.

5)  Derivation of the algorithm based on different
minimization techniques, such as Newton's method or the
Fletcher-Powell method, for a given performance criterion.